Development Tricks with Unreal Engine 4

Packt
22 Jun 2016
39 min read
In this article by Benjamin Carnall, the author of Unreal Engine 4 by Example, we will look at some development tricks with Unreal Engine 4.

Creating the C++ world objects

With the character constructed, we can now start to build the level. We are going to create a block out for the lanes that we will be using for the level. We can then use this block out to construct a mesh that we can reference in code. Before we get into the level creation, we should ensure that the functionality we implemented for the character works as intended.

With the BountyDashMap open, navigate to the C++ Classes folder of the Content Browser. Here, you will be able to see the BountyDashCharacter. Drag and drop the character into the game level onto the platform. Then, search for TargetPoint in the Modes panel. Drag and drop three of these target points into the game level, and you will be presented with the following:

Now, press the Play button to enter the PIE (Play In Editor) mode. The character will be automatically possessed and used for input. Also, ensure that when you press A or D, the character moves to the next available target point.

Now that we have the base of the character implemented, we should start to build the level. We require three lanes for the player to run down and obstacles for the player to dodge. For now, we will focus on the lanes that the player will be running on. Let's start by blocking out how the lanes will appear in the level. Drag a BSP Box brush into the game world. You can find the Box brush in the Modes panel under the BSP section, under the name Box. Place the box at world location (0.0f, 0.0f, -100.0f). This will place the box in the center of the world. Now, change the X property of the box under the Brush Settings section of the Details panel to 10000. We require the lane to be this long so that, later on, we can hide its end using fog without obscuring the objects that the player will need to dodge.

Next, we need to create two more copies of this box. You can do this by holding Alt while moving an object via the transform widget. Position one box copy at world location (0.0f, -230.0f, -100.0f) and the other at (0.0f, 230.0f, -100.0f). The last thing we need to do to finish blocking out the level is to place the Target Points in the center of each lane. You will be presented with this when you are done:

Converting BSP brushes into a static mesh

The next thing we need to do is convert the lane brushes we made into one mesh, so we can reference it within our code base. Select all of the boxes in the scene. You can do this by holding Ctrl while selecting the box brushes in the editor. With all of the brushes selected, address the Details panel. Ensure that the transform of your selection is positioned in the middle of the three brushes. If it is not, you can either reselect the brushes in a different order, or you can group the brushes by pressing Ctrl + G while the boxes are selected. This is important, as the position of the transform widget indicates what the origin of the generated mesh will be.

With the group or boxes selected, address the Brush Settings section in the Details panel. There is a small white expansion arrow at the bottom of the section; click on it now. You will then be presented with a Create Static Mesh button; press it now. Name this mesh Floor_Mesh_BountyDash, and save it under the Geometry/Meshes/ folder of the content folder.
Smoke and Mirrors with C++ objects

We are going to create the illusion of movement within our level. You may have noticed that we have not included any facilities in our character to move forward in the game world. This is because our character will never advance past its X position at 0. Instead, we are going to be moving the world toward and past him. This way, we can create very easy spawning and processing logic for the obstacles and game world, without having to worry about continuously spawning objects that the player can move past further and further down the X axis.

We require some of the level assets to move through the world, so we can establish the illusion of movement for the character. One of these moving objects will be the floor. This requires some logic that will reposition floor meshes as they reach a certain depth behind the character. We will be creating a swap chain of sorts that will work with three meshes. The meshes will be positioned in a contiguous line. As the meshes move underneath and behind the player, we need to move any mesh that is far enough behind the player back to the front of the swap chain. The effect is a never-ending chain of floor meshes constantly flowing underneath the player. The following diagram may help you understand the concept:

Obstacles and coin pickups will follow a similar logic. However, they will simply be destroyed upon reaching the kill point shown in the preceding diagram.

Modifying the BountyDashGameMode

Before we start to create code classes that will feature in our world, we are going to modify the BountyDashGameMode that was generated when the project was created. The game mode is going to be responsible for all of the game state variables and rules. Later on, we are going to use the game mode to determine how the player respawns when the game is lost.

BountyDashGameMode Class Definition

The game mode is going to be fairly simple; we are going to add a few member variables that will hold the current state of the game, such as game speed, game level, and the number of coins needed to increase the game speed. Navigate to BountyDashGameMode.h and add the following code:

UCLASS(minimalapi)
class ABountyDashGameMode : public AGameMode
{
    GENERATED_BODY()

    UPROPERTY()
    float gameSpeed;

    UPROPERTY()
    int32 gameLevel;

As you can see, we have two private member variables called gameSpeed and gameLevel. These are private, as we wish no other object to be able to modify the contents of these values. You will also note that the class has been specified with minimalapi. This specifier effectively informs the engine that other code modules will not need information from this object outside of the class type. This means you will be able to cast to this class type, but its functions cannot be called from other modules. This is specified as a way to optimize compile times, as no module outside of this project API will require interactions with our game mode.

Next, we declare the public functions and members that we will be using within our game mode. Add the following code to the ABountyDashGameMode class definition:

public:
    ABountyDashGameMode();

    void CharScoreUp(unsigned int charScore);

    UFUNCTION()
    float GetInvGameSpeed();

    UFUNCTION()
    float GetGameSpeed();

    UFUNCTION()
    int32 GetGameLevel();

The function called CharScoreUp() takes in the player's current score (held by the player) and changes game values based on this score. This means we are able to make the game more difficult as the player scores more points.
The next three functions are simply accessor methods that we can use to get the private data of this class from other objects. Next, we need to declare the protected members that we have exposed as EditAnywhere, so we may adjust them from the editor for testing purposes:

protected:
    UPROPERTY(EditAnywhere, BlueprintReadOnly)
    int32 numCoinsForSpeedIncrease;

    UPROPERTY(EditAnywhere, BlueprintReadWrite)
    float gameSpeedIncrease;
};

The numCoinsForSpeedIncrease variable will determine how many coins it takes to increase the speed of the game, and the gameSpeedIncrease value will determine how much faster the objects move when the numCoinsForSpeedIncrease quota has been met.

BountyDashGameMode Function Definitions

Let's begin adding some definitions to the BountyDashGameMode functions. They will be very simple at this point. Let's start by providing some default values for our member variables within the constructor and by assigning the class that is to be used for our default pawn. Add the definition for the ABountyDashGameMode constructor:

ABountyDashGameMode::ABountyDashGameMode()
{
    // set default pawn class to our ABountyDashCharacter
    DefaultPawnClass = ABountyDashCharacter::StaticClass();

    numCoinsForSpeedIncrease = 5;
    gameSpeed = 10.0f;
    gameSpeedIncrease = 5.0f;
    gameLevel = 1;
}

Here, we are setting the default pawn class by calling StaticClass() on the ABountyDashCharacter. As we have just referenced the ABountyDashCharacter type, ensure that #include "BountyDashCharacter.h" is added to the BountyDashGameMode.cpp include list. The StaticClass() function is provided by default for all objects, and it returns the class type information of the object as a UClass*. We then establish some default values for the member variables. The player will have to pick up five coins to increase the level. The game speed is set to 10.0f (10 m/s), and the game will speed up by 5.0f (5 m/s) every time the coin quota is reached.

Next, let's add a definition for the CharScoreUp() function:

void ABountyDashGameMode::CharScoreUp(unsigned int charScore)
{
    if (charScore != 0 &&
        charScore % numCoinsForSpeedIncrease == 0)
    {
        gameSpeed += gameSpeedIncrease;
        gameLevel++;
    }
}

This function is quite self-explanatory. The character's current score is passed into the function. We then check whether the character's score is not currently 0, and whether the remainder of the character's score divided by the number of coins needed for a speed increase is 0; that is, whether it divides evenly and the quota has been reached. If so, we increase the game speed by the gameSpeedIncrease value and then increment the level.

The last thing we need to add is the accessor methods described previously. They do not require much explanation, short of the GetInvGameSpeed() function. This function will be used by objects that wish to be pushed down the negative X axis at the game speed:

float ABountyDashGameMode::GetInvGameSpeed()
{
    return -gameSpeed;
}

float ABountyDashGameMode::GetGameSpeed()
{
    return gameSpeed;
}

int32 ABountyDashGameMode::GetGameLevel()
{
    return gameLevel;
}
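Before moving on, it may help to see the pieces assembled. The following is a hedged sketch of how the completed BountyDashGameMode.h might read once the snippets above are combined; the include paths and the generated-header name are assumptions based on a default UE4 C++ project and should be matched to whatever your class wizard generated.

// BountyDashGameMode.h -- assembled sketch; adjust includes and macros to your project
#pragma once

#include "GameFramework/GameMode.h"
#include "BountyDashGameMode.generated.h"

UCLASS(minimalapi)
class ABountyDashGameMode : public AGameMode
{
    GENERATED_BODY()

    // Private state: only the game mode may modify these values
    UPROPERTY()
    float gameSpeed;

    UPROPERTY()
    int32 gameLevel;

public:
    ABountyDashGameMode();

    // Called by the character as its score changes
    void CharScoreUp(unsigned int charScore);

    UFUNCTION()
    float GetInvGameSpeed();

    UFUNCTION()
    float GetGameSpeed();

    UFUNCTION()
    int32 GetGameLevel();

protected:
    // Exposed for tuning from the editor
    UPROPERTY(EditAnywhere, BlueprintReadOnly)
    int32 numCoinsForSpeedIncrease;

    UPROPERTY(EditAnywhere, BlueprintReadWrite)
    float gameSpeedIncrease;
};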
Getting our game mode via Template functions

The ABountyDashGameMode now contains information and functionality that will be required by most of the BountyDash objects we create going forward. We need to create a light-weight method of retrieving our custom game mode that ensures the type information is preserved. We can do this by creating a template function that will take in a world context and return the correct game mode handle. Traditionally, we could just use a direct cast to ABountyDashGameMode; however, this would require including BountyDashGameMode.h in BountyDash.h. As not all of our objects will require knowledge of the game mode, this is wasteful. Navigate to the BountyDash.h file now. You will be presented with the following:

#pragma once

#include "Engine.h"

What currently exists in the file is very simple: #pragma once has again been used to ensure that the compiler only builds and includes the file once. Then, Engine.h has been included, so every other object in the BOUNTYDASH_API (they include BountyDash.h by default) has access to the functions within Engine.h. This is a good place to include utility functions that you wish all objects to have access to. In this file, include the following lines of code:

template<typename T>
T* GetCustomGameMode(UWorld* worldContext)
{
    return Cast<T>(worldContext->GetAuthGameMode());
}

This code, simply put, is a template function that takes in a game world handle. It gets the game mode from this context via the GetAuthGameMode() function, and then casts this game mode to the template type provided to the function. We must cast to the template type, as GetAuthGameMode() simply returns an AGameMode handle.
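To illustrate how this helper reads at a call site, here is a small hedged example; the surrounding actor and the variable names are placeholders rather than code from the project, and any .cpp file that calls the function this way still needs to include BountyDashGameMode.h, as we will see when we code the floor.

// Hypothetical call site inside any actor's Tick() -- names are illustrative only
ABountyDashGameMode* currentGameMode =
    GetCustomGameMode<ABountyDashGameMode>(GetWorld());

if (currentGameMode)
{
    // Push this actor backward along X at the current game speed
    float speed = currentGameMode->GetInvGameSpeed();
    AddActorLocalOffset(FVector(speed, 0.0f, 0.0f));
}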
Now, with this in place, let's begin coding our never-ending floor.

Coding the floor

The construction of the floor will be quite simple in essence, as we only need a few variables and a Tick function to achieve the functionality we need. Use the class wizard to create a class named Floor that inherits from AActor. We will start by modifying the class definition found in Floor.h. Navigate to this file now.

Floor Class Definition

The class definition for the floor is very basic. All we need is a Tick function and some accessor methods, so we may provide some information about the floor to other objects. I have also removed the BeginPlay function provided by default by the class wizard, as it is not needed. The following is the AFloor class definition in its entirety; replace what is present in Floor.h with this now (keeping the #include list intact):

UCLASS()
class BOUNTYDASH_API AFloor : public AActor
{
    GENERATED_BODY()

public:
    // Sets default values for this actor's properties
    AFloor();

    // Called every frame
    virtual void Tick( float DeltaSeconds ) override;

    float GetKillPoint();
    float GetSpawnPoint();

protected:
    UPROPERTY(EditAnywhere)
    TArray<USceneComponent*> FloorMeshScenes;

    UPROPERTY(EditAnywhere)
    TArray<UStaticMeshComponent*> FloorMeshes;

    UPROPERTY(EditAnywhere)
    UBoxComponent* CollisionBox;

    int32 NumRepeatingMesh;
    float KillPoint;
    float SpawnPoint;
};

We have three UPROPERTY-declared members. The first two are TArrays that will hold handles to the USceneComponent and UStaticMeshComponent objects that will make up the floor. We require the TArray of scene components because the USceneComponent objects provide us with a world transform that we can apply translations to, so we may update the position of the generated floor mesh pieces. The last UPROPERTY is a collision box that will be used for the actual player collisions, to prevent the player from falling through the moving floor. The reason we are using a UBoxComponent instead of the meshes for collision is that we do not want the player to translate with the moving meshes. Due to surface friction simulation, having the character collide with any of the moving meshes would cause the player to move with the mesh.

The last three members are protected and do not require any UPROPERTY specification. We are simply going to use the two float values, KillPoint and SpawnPoint, to save the output of calculations from the constructor, so we may use them in the Tick() function. The integer value called NumRepeatingMesh will be used to determine how many meshes we will have in the chain.

Floor Function Definitions

As always, we will start with the constructor of the floor. We will be performing the bulk of our calculations for this object here. We will be creating the USceneComponents and UStaticMeshComponents that we are going to use to make up our moving floor. With dynamic programming in mind, we should establish the construction algorithm so that we can create any number of meshes in the moving line. Also, as we will be getting the speed of the floor's movement from the game mode, ensure that #include "BountyDashGameMode.h" is included in Floor.cpp.

The AFloor::AFloor() constructor

Start by adding the following lines to the AFloor constructor, AFloor::AFloor(), which is found in Floor.cpp:

RootComponent = CreateDefaultSubobject<USceneComponent>(TEXT("Root"));

ConstructorHelpers::FObjectFinder<UStaticMesh> myMesh(TEXT(
    "/Game/Barrel_Hopper/Geometry/Floor_Mesh_BountyDash.Floor_Mesh_BountyDash"));

ConstructorHelpers::FObjectFinder<UMaterial> myMaterial(TEXT(
    "/Game/StarterContent/Materials/M_Concrete_Tiles.M_Concrete_Tiles"));

To start with, we are simply using FObjectFinders to find the assets that we require for the mesh. For the myMesh finder, ensure that you pass in the reference location of the static floor mesh that we created earlier. We also created a scene component to be used as the root component for the floor object. Next, we are going to check the success of the mesh acquisition and then establish some variables for the mesh placement logic:

if (myMesh.Succeeded())
{
    NumRepeatingMesh = 3;

    FBoxSphereBounds myBounds = myMesh.Object->GetBounds();
    float XBounds = myBounds.BoxExtent.X * 2;
    float ScenePos = ((XBounds * (NumRepeatingMesh - 1)) / 2.0f) * -1;

    KillPoint = ScenePos - (XBounds * 0.5f);
    SpawnPoint = (ScenePos * -1) + (XBounds * 0.5f);

Note that we have just opened an if statement without closing the scope; from time to time, I will split the segments of the code within a scope across multiple pages. If you are ever lost as to the current scope that we are working from, look for a comment such as // <-- Closing if(myMesh.Succeeded()) or a similarly named comment.

Firstly, we are initializing the NumRepeatingMesh value with 3. We are using a variable here instead of a hard-coded value so that we may update the number of meshes in the chain without having to refactor the remaining code base. We then get the bounds of the mesh object using the GetBounds() function on the mesh asset that we just retrieved. This returns an FBoxSphereBounds structure, which provides all of the bounding information of a static mesh asset. We then use the X component of the member called BoxExtent to initialize XBounds. BoxExtent is a vector that holds the extent of the bounding box of this mesh. We save the X component of this vector, so we can use it for the mesh chain placement logic. We have doubled this value, as the BoxExtent vector only represents the extent of the box from the origin to one corner of the mesh; if we want the total bounds of the mesh, we must double any of the BoxExtent components.

Next, we calculate the initial position of the first USceneComponent we will be attaching a mesh to, and store it in ScenePos. We can determine this position by getting the total length of the chain up to the last mesh, XBounds * (NumRepeatingMesh - 1), and then halving the resulting value, which gives us the distance of the first scene component from the origin along the X axis. We also multiply this value by -1 to make it negative, as we wish to start our mesh chain behind the character (which sits at X position 0). We then use ScenePos to specify KillPoint, which represents the point in space that floor mesh pieces must reach before being swapped back to the start of the chain. For the purposes of the swap chain, whenever a scene component is half a mesh piece length behind the position of the first scene component in the chain, it should be moved to the other side of the chain.
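To make the placement arithmetic concrete, the following hedged snippet works the numbers through for an assumed mesh whose BoxExtent.X is 500 units; the real Floor_Mesh_BountyDash extent will differ, but the relationships stay the same.

// Hypothetical numbers only: assume BoxExtent.X == 500, so one mesh piece is 1000 units long
float XBounds = 500.0f * 2;                                            // 1000 -- full length of one mesh piece
int32 NumRepeatingMesh = 3;

float ScenePos   = ((XBounds * (NumRepeatingMesh - 1)) / 2.0f) * -1;   // ((1000 * 2) / 2) * -1 = -1000
float KillPoint  = ScenePos - (XBounds * 0.5f);                        // -1000 - 500 = -1500
float SpawnPoint = (ScenePos * -1) + (XBounds * 0.5f);                 //  1000 + 500 =  1500

// The three scene components start at X = -1000, 0, and 1000, so the chain spans
// -1500..1500; a piece whose scene component crosses X = -1500 teleports back to X = 1500.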
With all of our variables in place, we can now iterate through the number of meshes we desire (3) and create the appropriate components. Add the following code to the scope of the if statement that we just opened:

for (int i = 0; i < NumRepeatingMesh; ++i)
{
    // Initialize Scene
    FString SceneName = "Scene" + FString::FromInt(i);
    FName SceneID = FName(*SceneName);
    USceneComponent* thisScene = CreateDefaultSubobject<USceneComponent>(SceneID);
    check(thisScene);

    thisScene->AttachTo(RootComponent);
    thisScene->SetRelativeLocation(FVector(ScenePos, 0.0f, 0.0f));
    ScenePos += XBounds;

    FloorMeshScenes.Add(thisScene);

Firstly, we are creating a name for the scene component by appending Scene with the iteration value that we are up to. We then convert this appended FString to an FName and provide it to the CreateDefaultSubobject template function. With the resultant USceneComponent handle, we call AttachTo() to bind it to the root component. Then, we set the relative location of the USceneComponent. Here, we are passing in the ScenePos value that we calculated earlier as the X component of the new relative location. The relative location of this component will always be based on the position of the root scene component that we created earlier.

With the USceneComponent appropriately placed, we increment the ScenePos value by that of XBounds. This ensures that each subsequent USceneComponent created in this loop will be placed an entire mesh length away from the previous one, forming a contiguous chain of meshes attached to scene components. Lastly, we add this new USceneComponent to FloorMeshScenes, so we may later perform translations on the components.

Next, we will construct the mesh components. Add the following code to the loop:

    // Initialize Mesh
    FString MeshName = "Mesh" + FString::FromInt(i);
    UStaticMeshComponent* thisMesh = CreateDefaultSubobject<UStaticMeshComponent>(FName(*MeshName));
    check(thisMesh);

    thisMesh->AttachTo(FloorMeshScenes[i]);
    thisMesh->SetRelativeLocation(FVector(0.0f, 0.0f, 0.0f));
    thisMesh->SetCollisionProfileName(TEXT("OverlapAllDynamic"));

    if (myMaterial.Succeeded())
    {
        thisMesh->SetStaticMesh(myMesh.Object);
        thisMesh->SetMaterial(0, myMaterial.Object);
    }

    FloorMeshes.Add(thisMesh);
} // <-- Closing for(int i = 0; i < NumRepeatingMesh; ++i)

As you can see, we have performed a similar name creation process for the UStaticMeshComponents as we did for the USceneComponents. The preceding construction process was quite simple.
We attach the mesh to the scene component so the mesh will follow any translation that we apply to the parent USceneComponent. We then ensure that the mesh's origin will be centered around the USceneComponent by setting its relative location to (0.0f, 0.0f, 0.0f). We then ensure that the meshes do not collide with anything in the game world; we do so with the SetCollisionProfileName() function. If you remember, when we used this function earlier, we provided a profile name whose collision properties we wished the object to use. In our case, we wish this mesh to overlap all dynamic objects, thus we pass OverlapAllDynamic. Without this line of code, the character may collide with the moving floor meshes, which would drag the player along at the same speed, breaking the illusion of motion we are trying to create. Lastly, we assign the static mesh object and the material we obtained earlier with the FObjectFinders. We ensure that we add this new mesh object to the FloorMeshes array in case we need it later. We also close the loop scope that we created earlier.

The next thing we are going to do is create the collision box that will be used for character collisions. With the box set to collide with everything and the meshes set to overlap everything, we will be able to collide with the stationary box while the meshes whip past under our feet. The following code will create the box collider:

CollisionBox = CreateDefaultSubobject<UBoxComponent>(TEXT("CollisionBox"));
check(CollisionBox);

CollisionBox->AttachTo(RootComponent);
CollisionBox->SetBoxExtent(FVector(SpawnPoint, myBounds.BoxExtent.Y, myBounds.BoxExtent.Z));
CollisionBox->SetCollisionProfileName(TEXT("BlockAllDynamic"));

} // <-- Closing if(myMesh.Succeeded())

As you can see, we initialize the UBoxComponent as we always initialize components. We then attach the box to the root component, as we do not wish it to move. We also set the box extent to be the length of the entire swap chain by using the SpawnPoint value as the X bounds of the collider. We set the collision profile to BlockAllDynamic. This means it will block any dynamic actor, such as our character. Note that we have also closed the scope of the if statement opened earlier. With the constructor definition finished, we might as well define the accessor methods for SpawnPoint and KillPoint before we move on to the Tick() function:

float AFloor::GetKillPoint()
{
    return KillPoint;
}

float AFloor::GetSpawnPoint()
{
    return SpawnPoint;
}

AFloor::Tick()

Now, it is time to write the function that will move the meshes and ensure that they move back to the start of the chain when they reach KillPoint. Add the following code to the Tick() function found in Floor.cpp:

for (auto Scene : FloorMeshScenes)
{
    Scene->AddLocalOffset(FVector(GetCustomGameMode<ABountyDashGameMode>(GetWorld())->GetInvGameSpeed(), 0.0f, 0.0f));

    if (Scene->GetComponentTransform().GetLocation().X <= KillPoint)
    {
        Scene->SetRelativeLocation(FVector(SpawnPoint, 0.0f, 0.0f));
    }
}

Here, we use a C++11 range-based for loop. This means that for each element inside FloorMeshScenes, the Scene handle of type auto will be populated with a pointer to whatever type is contained by FloorMeshScenes; in this case, USceneComponent*. For every scene component contained within FloorMeshScenes, we add a local offset each frame. The amount we offset each frame depends on the current game speed. We are getting the game speed from the game mode via the template function that we wrote earlier.
As you can see, we have specified the template function to be of the ABountyDashGameMode type, thus we have access to the BountyDash game mode functionality. We have done this so that the floor will move faster under the player's feet as the speed of the game increases. The next thing we do is check the X value of the scene component's location. If this value is less than or equal to the value stored in KillPoint, we reposition the scene component back to the spawn point. As we attached the meshes to the USceneComponents earlier, the meshes will also translate with the scene components. Lastly, ensure that you have added #include "BountyDashGameMode.h" to the .cpp include list.

Placing the Floor in the level!

We are done making the floor! Compile the code and return to the level editor. We can now place this new floor object in the level. Delete the static mesh that replaced our earlier box brushes, and drag and drop the Floor object into the scene. The Floor object can be found under the C++ Classes folder of the Content Browser. Select the floor in the level, and ensure that its location is set to (0.0f, 0.0f, -100.0f). This will place the floor just below the player's feet around the origin. Also, ensure that the ATargetPoints that we placed earlier are in the right positions above the lanes. With all this in place, you should be able to press Play and observe the floor moving underneath the player indefinitely. You will see something similar to this:

You will notice that as you move between the lanes by pressing A and D, the player maintains the X position of the target points but nicely travels to the center of each lane.

Creating the obstacles

The next step for this project is to create the obstacles that will come flying at the player. These obstacles are going to be very simple and contain only a few members and functions. The obstacles only serve as a blockade for the player, and all of the collision with the obstacles will be handled by the player itself. Use the class wizard to create a new class named Obstacle, and inherit this object from AActor. Once the class has been generated, modify the class definition found in Obstacle.h so that it appears as follows:

UCLASS(BlueprintType)
class BOUNTYDASH_API AObstacle : public AActor
{
    GENERATED_BODY()

    float KillPoint;

public:
    // Sets default values for this actor's properties
    AObstacle();

    // Called when the game starts or when spawned
    virtual void BeginPlay() override;

    // Called every frame
    virtual void Tick( float DeltaSeconds ) override;

    void SetKillPoint(float point);
    float GetKillPoint();

protected:
    UFUNCTION()
    virtual void MyOnActorOverlap(AActor* otherActor);

    UFUNCTION()
    virtual void MyOnActorEndOverlap(AActor* otherActor);

public:
    UPROPERTY(EditAnywhere, BlueprintReadWrite)
    USphereComponent* Collider;

    UPROPERTY(EditAnywhere, BlueprintReadWrite)
    UStaticMeshComponent* Mesh;
};

You will notice that the class has been declared with the BlueprintType specifier! This object is simple enough to justify extension into Blueprint, as there is no new learning to be found within this simple object, and we can use Blueprint for convenience. For this class, we have added a private member called KillPoint that will be used to determine when AObstacle should destroy itself. We have also added the accessor methods for this private member.
You will notice that we have added the MyOnActorOverlap and MyOnActorEndOverlap functions, which we will bind to the appropriate delegates so that we can provide a custom collision response for this object. The definitions of these functions are not too complicated either. Ensure that you have included #include "BountyDashGameMode.h" in Obstacle.cpp. Then, we can begin filling out our function definitions; the following is the code we will use for the constructor:

AObstacle::AObstacle()
{
    PrimaryActorTick.bCanEverTick = true;

    Collider = CreateDefaultSubobject<USphereComponent>(TEXT("Collider"));
    check(Collider);

    RootComponent = Collider;
    Collider->SetCollisionProfileName("OverlapAllDynamic");

    Mesh = CreateDefaultSubobject<UStaticMeshComponent>(TEXT("Mesh"));
    check(Mesh);
    Mesh->AttachTo(Collider);
    Mesh->SetCollisionResponseToAllChannels(ECR_Ignore);
    KillPoint = -20000.0f;

    OnActorBeginOverlap.AddDynamic(this, &AObstacle::MyOnActorOverlap);
    OnActorEndOverlap.AddDynamic(this, &AObstacle::MyOnActorEndOverlap);
}

The only thing of note within this constructor is that, again, we set the mesh of this object to ignore the collision response on all channels; this means that the mesh will not affect collision in any way. We have also initialized KillPoint with a default value of -20000.0f. Following that, we bind the custom MyOnActorOverlap and MyOnActorEndOverlap functions to the appropriate delegates.

The Tick() function of this object is responsible for translating the obstacle during play. Add the following code to the Tick function of AObstacle:

void AObstacle::Tick( float DeltaTime )
{
    Super::Tick( DeltaTime );
    float gameSpeed = GetCustomGameMode<ABountyDashGameMode>(GetWorld())->GetInvGameSpeed();

    AddActorLocalOffset(FVector(gameSpeed, 0.0f, 0.0f));

    if (GetActorLocation().X < KillPoint)
    {
        Destroy();
    }
}

As you can see, the Tick function adds an offset to the AObstacle each frame along the X axis via the AddActorLocalOffset function. The value of the offset is determined by the game speed set in the game mode; again, we are using the template function that we created earlier to get the game mode and call GetInvGameSpeed(). AObstacle is also responsible for its own destruction; upon reaching the maximum bound defined by KillPoint, the AObstacle will destroy itself. The last thing we need to add is the function definitions for the overlap functions and the KillPoint accessors:

void AObstacle::MyOnActorOverlap(AActor* otherActor)
{
}

void AObstacle::MyOnActorEndOverlap(AActor* otherActor)
{
}

void AObstacle::SetKillPoint(float point)
{
    KillPoint = point;
}

float AObstacle::GetKillPoint()
{
    return KillPoint;
}

Now, let's abstract this class into Blueprint. Compile the code and go back to the game editor. Within the content folder, create a new Blueprint object that inherits from the Obstacle class that we just made, and name it RockObstacleBP. Within this blueprint, we need to make some adjustments. Select the Collider component that we created, and expand the Shape section in the Details panel. Change the Sphere Radius property to 100.0f. Next, select the Mesh component and expand the Static Mesh section. From the provided drop-down menu, choose the SM_Rock mesh.
Next, expand the Transform section of the Mesh component's Details panel and match these values:

You should end up with an object that looks similar to this:

Spawning Actors from C++

Despite the obstacles being fairly easy to implement from a C++ standpoint, the complication usually comes from the spawning system that we will be using to create these objects in the game. We will leverage a system similar to the player's movement by basing the spawn locations off of the ATargetPoints that are already in the scene. We can then randomly select a spawn target whenever we require a new object to spawn. Open the class wizard now, and create a class that inherits from Actor and call it ObstacleSpawner. We inherit from AActor because, even though this object does not have a physical presence in the scene, we still require the ObstacleSpawner to tick.

The first issue we are going to encounter is that our current target points give us a good indication of the Y position for our spawns, but the X position is centered around the origin. This is undesirable for the obstacle spawn point, as we would like to spawn these objects a fair distance away from the player, so we can do two things: one, obscure the popping of spawning objects via fog, and two, present the player with enough obstacle information so that they may dodge them at high speeds. This means we are going to require some information from our floor object; we can use the KillPoint and SpawnPoint members of the floor to determine the spawn and kill locations of the obstacles.

Obstacle Spawner Class definition

This will be another fairly simple object. It will require a BeginPlay function, so we may find the floor and all the target points that we require for spawning. We also require a Tick function, so that we may process the spawning logic on a per-frame basis. Thankfully, both of these are provided by default by the class wizard. We have created a protected SpawnObstacle() function, so we may group that functionality together. We are also going to require a few UPROPERTY-declared members that can be edited from the level editor. We need a list of obstacle types to spawn; we can then randomly select one of the types each time we spawn an obstacle. We also require the spawn targets (though we will be populating these upon beginning play). Finally, we will need a spawn time that we can set for the interval between obstacles spawning. To accommodate all of this, navigate to ObstacleSpawner.h now and modify the class definition to match the following:

UCLASS()
class BOUNTYDASH_API AObstacleSpawner : public AActor
{
    GENERATED_BODY()

public:
    // Sets default values for this actor's properties
    AObstacleSpawner();

    // Called when the game starts or when spawned
    virtual void BeginPlay() override;

    // Called every frame
    virtual void Tick( float DeltaSeconds ) override;

protected:
    void SpawnObstacle();

public:
    UPROPERTY(EditAnywhere, BlueprintReadWrite)
    TArray<TSubclassOf<class AObstacle>> ObstaclesToSpawn;

    UPROPERTY()
    TArray<class ATargetPoint*> SpawnTargets;

    UPROPERTY(EditAnywhere, BlueprintReadWrite)
    float SpawnTimer;

    UPROPERTY()
    USceneComponent* Scene;

private:
    float KillPoint;
    float SpawnPoint;
    float TimeSinceLastSpawn;
};

I have again used TArrays for our containers of obstacle objects and spawn targets. As you can see, the obstacle list is of type TSubclassOf<class AObstacle>.
This means that the objects in this TArray will be class types that inherit from AObstacle. This is very useful, as not only will we be able to use these array elements for spawn information, but the engine will also filter our search when we add object types to this array from the editor. With these class types, we will be able to spawn objects that inherit from AObstacle (including blueprints) when required. We have also included a scene component, so we can arbitrarily place the AObstacleSpawner somewhere in the level, and two private members that will hold the kill and spawn points for the objects. The last element is a float timer that will be used to gauge how much time has passed since the last obstacle spawn.

Obstacle Spawner function definitions

Okay, now we can create the body of the AObstacleSpawner object. Before we do so, ensure that the include list in ObstacleSpawner.cpp is as follows:

#include "BountyDash.h"
#include "BountyDashGameMode.h"
#include "Engine/TargetPoint.h"
#include "Floor.h"
#include "Obstacle.h"
#include "ObstacleSpawner.h"

Following this, we have a very simple constructor that establishes the root scene component:

// Sets default values
AObstacleSpawner::AObstacleSpawner()
{
    // Set this actor to call Tick() every frame. You can turn this off to improve performance if you don't need it.
    PrimaryActorTick.bCanEverTick = true;

    Scene = CreateDefaultSubobject<USceneComponent>(TEXT("Root"));
    check(Scene);
    RootComponent = Scene;

    SpawnTimer = 1.5f;
}

Following the constructor, we have BeginPlay(). Inside this function, we are going to do a few things. Firstly, we perform the same in-level object retrieval that we executed in ABountyDashCharacter to get the locations of the ATargetPoints. However, this object also requires information from the floor object in the level. We are going to get the floor object the same way we did the ATargetPoints, by utilizing TActorIterators. We will then get the required kill and spawn point information. We also set TimeSinceLastSpawn to SpawnTimer so that we begin spawning objects instantaneously:

// Called when the game starts or when spawned
void AObstacleSpawner::BeginPlay()
{
    Super::BeginPlay();

    for (TActorIterator<ATargetPoint> TargetIter(GetWorld()); TargetIter; ++TargetIter)
    {
        SpawnTargets.Add(*TargetIter);
    }

    for (TActorIterator<AFloor> FloorIter(GetWorld()); FloorIter; ++FloorIter)
    {
        if (FloorIter->GetWorld() == GetWorld())
        {
            KillPoint = FloorIter->GetKillPoint();
            SpawnPoint = FloorIter->GetSpawnPoint();
        }
    }
    TimeSinceLastSpawn = SpawnTimer;
}

The next function we will look at in detail is Tick(), which is responsible for the bulk of the AObstacleSpawner functionality. Within this function, we need to check whether we require a new object to be spawned, based on the amount of time that has passed since we last spawned an object. Add the following code to AObstacleSpawner::Tick() underneath Super::Tick():

TimeSinceLastSpawn += DeltaTime;

float trueSpawnTime = SpawnTimer / (float)GetCustomGameMode<ABountyDashGameMode>(GetWorld())->GetGameLevel();

if (TimeSinceLastSpawn > trueSpawnTime)
{
    TimeSinceLastSpawn = 0.0f;
    SpawnObstacle();
}

Here, we are accumulating the delta time in TimeSinceLastSpawn, so we may gauge how much real time has passed since the last obstacle was spawned. We then calculate the trueSpawnTime of the AObstacleSpawner.
This is based off of the base spawn time (SpawnTimer), which is divided by the current game level retrieved from the game mode via the GetCustomGameMode() template function. This means that as the game level increases and the obstacles begin to move faster, the obstacle spawner will also spawn objects at a faster rate. If the accumulated TimeSinceLastSpawn is greater than the calculated trueSpawnTime, we call SpawnObstacle() and reset the TimeSinceLastSpawn timer to 0.0f.

Getting information from components in C++

Now, we need to write the spawn function. This spawn function is going to have to retrieve some information from the components of the object that is being spawned. As we have allowed our AObstacle class to be extended into Blueprint, we have also exposed the object to a level of versatility that we must compensate for in the code base. With the ability to customize the mesh and the bounds of the sphere collider that make up any given obstacle, we must be sure to spawn the obstacle in the right place regardless of its size. To do this, we are going to need to obtain information from the components contained within the spawned AObstacle class. This can be done via GetComponentByClass(). It takes the UClass* of the component you wish to retrieve, and it returns a handle to the component if it has been found. We can then cast this handle to the appropriate type and retrieve the information that we require. Let's begin detailing the spawn function; add the following code to ObstacleSpawner.cpp:

void AObstacleSpawner::SpawnObstacle()
{
    if (SpawnTargets.Num() > 0 && ObstaclesToSpawn.Num() > 0)
    {
        short Spawner = FMath::Rand() % SpawnTargets.Num();
        short Obstical = FMath::Rand() % ObstaclesToSpawn.Num();
        float CapsuleOffset = 0.0f;

Here, we ensure that both of the arrays have been populated with at least one valid member. We then generate the random lookup integers that we will use to access the SpawnTargets and ObstaclesToSpawn arrays. This means that every time we spawn an object, both the lane spawned in and the type of the object will be randomized. We do this by generating a random value with FMath::Rand(), and then we find the remainder of this number divided by the number of elements in the corresponding array. The result will be a random number between zero and the number of objects in either array minus one, which is perfect for our needs. Continue by adding the following code:

FActorSpawnParameters SpawnInfo;

FTransform myTrans = SpawnTargets[Spawner]->GetTransform();
myTrans.SetLocation(FVector(SpawnPoint, myTrans.GetLocation().Y, myTrans.GetLocation().Z));

Here, we are using a struct called FActorSpawnParameters. The default values of this struct are fine for our purposes. We will soon be passing this struct to a spawn function on our world context. After this, we create a transform that we will also be providing to the world context. The transform of the spawner will suffice, apart from the X component of the location. We need to adjust this so that the X value of the spawn transform matches the spawn point that we retrieved from the floor. We do this by setting the X component of the spawn transform's location to the SpawnPoint value that we received earlier, while keeping the other components of the location vector as the current Y and Z values.

The next thing we must do is actually spawn the object! We are going to utilize a template function called SpawnActor() that can be called from the UWorld* handle returned by GetWorld().
This function will spawn an object of a specified type in the game world at a specified location. The type of the object is determined by the UClass* handle that is passed in, which holds the object type we wish to spawn. The transform and spawn parameters of the object are determined by the corresponding input parameters of SpawnActor(). The template type of the function dictates the type of object that is spawned and the handle that is returned from the function. In our case, we require an AObstacle to be spawned. Add the following code to the SpawnObstacle function:

AObstacle* newObs = GetWorld()->SpawnActor<AObstacle>(ObstaclesToSpawn[Obstical], myTrans, SpawnInfo);

if (newObs)
{
    newObs->SetKillPoint(KillPoint);

As you can see, we are using SpawnActor() with a template type of AObstacle. We use the random lookup integer we generated before to retrieve the class type from the ObstaclesToSpawn array. We also provide the transform and spawn parameters we created earlier to SpawnActor(). If the new AObstacle was created successfully, we save the return of this function into an AObstacle handle that we use to set the kill point of the obstacle via SetKillPoint().

We must now adjust the height of this object. In its current state, the object will more than likely spawn in the ground. We need to get access to the sphere component of the obstacle, so we may get the radius of the sphere and adjust the position of the obstacle so that it sits above the ground. We can use the sphere as a reliable reference, as it is the root component of the obstacle; thus we can move the obstacle entirely out of the ground if we assume the base of the sphere will line up with the base of the mesh. Add the following code to the SpawnObstacle() function:

USphereComponent* obsSphere = Cast<USphereComponent>(newObs->GetComponentByClass(USphereComponent::StaticClass()));

if (obsSphere)
{
    newObs->AddActorLocalOffset(FVector(0.0f, 0.0f, obsSphere->GetUnscaledSphereRadius()));
}
} // <-- Closing if(newObs)
} // <-- Closing if(SpawnTargets.Num() > 0 && ObstaclesToSpawn.Num() > 0)

Here, we are getting the sphere component out of the newObs handle that we obtained from SpawnActor(), via the GetComponentByClass() function that was mentioned previously. We pass the class type of USphereComponent to the function via the static function StaticClass(). This will return a valid handle if newObs does indeed contain a USphereComponent (which we know it does). We then cast the result of this function to USphereComponent* and save it in the obsSphere handle. We ensure that this handle is valid; if it is, we offset the actor that we just spawned on the Z axis by the unscaled radius of the sphere component. This will result in all the obstacles spawning in line with the top of the floor!
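Because SpawnObstacle() was built up across several fragments with scope-closing comments, here is a hedged sketch of the whole function assembled in one piece, intended purely as a brace-matching reference for the code above; the unused CapsuleOffset local from the fragments is omitted.

void AObstacleSpawner::SpawnObstacle()
{
    if (SpawnTargets.Num() > 0 && ObstaclesToSpawn.Num() > 0)
    {
        // Randomize both the lane and the obstacle type
        short Spawner = FMath::Rand() % SpawnTargets.Num();
        short Obstical = FMath::Rand() % ObstaclesToSpawn.Num();

        FActorSpawnParameters SpawnInfo;

        // Use the chosen target point's transform, but spawn at the floor's spawn point on X
        FTransform myTrans = SpawnTargets[Spawner]->GetTransform();
        myTrans.SetLocation(FVector(SpawnPoint, myTrans.GetLocation().Y, myTrans.GetLocation().Z));

        AObstacle* newObs = GetWorld()->SpawnActor<AObstacle>(ObstaclesToSpawn[Obstical], myTrans, SpawnInfo);

        if (newObs)
        {
            newObs->SetKillPoint(KillPoint);

            // Lift the obstacle by its collider radius so it rests on top of the floor
            USphereComponent* obsSphere = Cast<USphereComponent>(newObs->GetComponentByClass(USphereComponent::StaticClass()));

            if (obsSphere)
            {
                newObs->AddActorLocalOffset(FVector(0.0f, 0.0f, obsSphere->GetUnscaledSphereRadius()));
            }
        }
    }
}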
Ensuring the obstacle spawner works

Okay, now is the time to bring the obstacle spawner into the scene. Be sure to compile the code, then navigate to the C++ Classes folder of the Content Browser. From here, drag and drop the ObstacleSpawner into the scene. Select the new ObstacleSpawner via the World Outliner, and address the Details panel. You will see the exposed members under the ObstacleSpawner section, like so:

Now, to add the RockObstacleBP that we made earlier to the ObstaclesToSpawn array, press the small white plus next to the property in the Details panel; this will add an element to the TArray that you can then customize. Select the drop-down menu that currently says None. Within this menu, search for RockObstacleBP and select it. If you wish to create and add more obstacle types to this array, feel free to do so. We do not need to add any members to the Spawn Targets property, as this will happen automatically. Now, press Play and behold a legion of moving rocks.

Summary

This article gave an overview of various development tricks associated with Unreal Engine 4.
Adding Media to Our Site

Packt
21 Jun 2016
19 min read
In this article by Neeraj Kumar et al., authors of the book Drupal 8 Development Beginner's Guide - Second Edition, we will see that a text-only site is not going to hold the interest of visitors; a site needs some pizzazz and some spice! One way to add some pizzazz to your site is by adding some multimedia content, such as images, video, audio, and so on. But we don't just want to add a few images here and there; in fact, we want an immersive and compelling multimedia experience that is easy to manage, configure, and extend.

The File entity (https://drupal.org/project/file_entity) module for Drupal 8 will enable us to manage files very easily. In this article, we will discover how to integrate the File entity module to add images to our d8dev site, and will explore compelling ways to present images to users. This will include taking a look at the integration of a lightbox-type UI element for displaying the File-entity-module-managed images, and learning how we can create custom image styles through the UI and through code. The following topics will be covered in this article:

The File entity module for Drupal 8
Adding a Recipe image field to your content types
Code example: image styles for Drupal 8
Displaying recipe images in a lightbox popup
Working with Drupal issue queues

Introduction to the File entity module

As per the module page at https://www.drupal.org/project/file_entity:

File entity provides interfaces for managing files. It also extends the core file entity, allowing files to be fieldable, grouped into types, viewed (using display modes) and formatted using field formatters. File entity integrates with a number of modules, exposing files to Views, Entity API, Token and more.

In our case, we need this module to easily edit image properties such as Title text and Alt text. These properties will be used in the Colorbox popup to display the images with captions.

Working with dev versions of modules

There are times when you come across a module that introduces some major new features and is fairly stable, but not quite ready for use on a live/production website, and is therefore available only as a dev version. This is a perfect opportunity to provide a valuable contribution to the Drupal community. Just by installing and using a dev version of a module (in your local development environment, of course), you are providing valuable testing for the module maintainers. Of course, you should enter an issue in the project's issue queue if you discover any bugs or would like to request any additional features. Also, using a dev version of a module presents you with the opportunity to take on some custom Drupal development. However, it is important to remember that a module is released as a dev version for a reason, and it is most likely not stable enough to be deployed on a public-facing site. Our use of the File entity module in this article is a good example of working with the dev version of a module.

One thing to note: Drush will download official and dev module releases. But at this point in time, there is no official port of the File entity module for Drupal 8, so we will use the unofficial one, which lives on GitHub (https://github.com/drupal-media/file_entity). In the next step, we will be downloading the dev release from GitHub.
Time for action – installing a dev version of the File entity module

In Drupal, we use Drush to download and enable any module/theme, but there is no official port yet of the File entity module for Drupal 8, so we will use the unofficial one, which lives on GitHub at https://github.com/drupal-media/file_entity:

Open the Terminal (Mac OS X) or Command Prompt (Windows) application, and go to the root directory of your d8dev site. Go inside the modules folder and download the File entity module from GitHub. We use the git command to download this module:

$ git clone https://github.com/drupal-media/file_entity

Another way is to download a .zip file from https://github.com/drupal-media/file_entity and extract it in the modules folder.

Next, on the Extend page (admin/modules), enable the File entity module.

What just happened?

We enabled the File entity module, and learned how to download and install it from GitHub.

A new recipe for our site

In this article, we are going to create a new recipe: Thai Basil Chicken. It gives us some more real content to use as an example, and feel free to try the recipe out!

Name: Thai Basil Chicken
Description: A spicy, flavorful version of one of my favorite Thai dishes
RecipeYield: Four servings
PrepTime: 25 minutes
CookTime: 20 minutes
Ingredients: One pound boneless chicken breasts; two tablespoons of olive oil; four garlic cloves, minced; three tablespoons of soy sauce; two tablespoons of fish sauce; two large sweet onions, sliced; five cloves of garlic; one yellow bell pepper; one green bell pepper; four to eight Thai peppers (depending on the level of hotness you want); one-third cup of dark brown sugar dissolved in one cup of hot water; one cup of fresh basil leaves; two cups of Jasmine rice
Instructions: Prepare the Jasmine rice according to the directions. Heat the olive oil in a large frying pan over medium heat for two minutes. Add the chicken to the pan and then pour on the soy sauce. Cook the chicken until there is no visible pinkness, approximately 8 to 10 minutes. Reduce heat to medium low. Add the garlic and fish sauce, and simmer for 3 minutes. Next, add the Thai chilies, onion, and bell pepper and stir to combine. Simmer for 2 minutes. Add the brown sugar and water mixture. Stir to mix, and then cover. Simmer for 5 minutes. Uncover, add basil, and stir to combine. Serve over rice.

Time for action – adding a Recipe images field to our Recipe content type

We will use the Manage fields administrative page to add a media field to our d8dev Recipe content type:

Open up the d8dev site in your favorite browser, click on the Structure link in the Admin toolbar, and then click on the Content types link. Next, on the Content types administrative page, click on the Manage fields link for your Recipe content type.

Now, on the Manage fields administrative page, click on the Add field link. On the next screen, select Image from the Add a new field dropdown and set the Label to Recipe images. Click on the Save field settings button. Next, on the Field settings page, select Unlimited as the allowed number of values. Click on the Save field settings button. On the next screen, leave all settings as they are and click on the Save settings button. Next, on the Manage form display page, select the Editable file widget for the Recipe images field and click on the Save button. Now, on the Manage display page, for the Recipe images field, select Hidden as the label. Click on the settings icon. Then select Medium (220x220) as the image style, and click on the Update button.
At the bottom, click on the Save button.

Let's add some Recipe images to a recipe. Click on the Content link in the menu bar, and then click on Add content and Recipe. On the next screen, fill in the title as Thai Basil Chicken and fill in the other fields with the recipe details mentioned previously. Now, scroll down to the new Recipe images field that you have added. Click on the Add a new file button, or drag and drop the images that you want to upload. Then click on the Save and Publish button.

Reload your Thai Basil Chicken recipe page, and you should see something similar to the following:

All the images are stacked on top of each other. So, we will add the following CSS just under the styles for field--name-field-recipe-images and field--type-recipe-images in the /modules/d8dev/styles/d8dev.css file, to lay out the Recipe images in more of a grid:

.node .field--type-recipe-images {
  float: none !important;
}
.field--name-field-recipe-images .field__item {
  display: inline-flex;
  padding: 6px;
}

Now we will load this d8dev.css file to apply the grid style. In Drupal 8, loading a CSS file is a three-step process:

Save the CSS to a file.
Define a library, which can contain the CSS file.
Attach the library to a render array in a hook.

So, we have already saved a CSS file called d8dev.css under the styles folder; now we will define a library. To define one or more (asset) libraries, add a *.libraries.yml file to your module folder. Our module is named d8dev, so the filename should be d8dev.libraries.yml. Each library in the file is an entry detailing CSS, like this:

d8dev:
  version: 1.x
  css:
    theme:
      styles/d8dev.css: {}

Now, we define the hook_page_attachments() function to load the CSS file. Add the following code inside the d8dev.module file. Use this hook when you want to conditionally add attachments to a page:

/**
 * Implements hook_page_attachments().
 */
function d8dev_page_attachments(array &$attachments) {
  $attachments['#attached']['library'][] = 'd8dev/d8dev';
}

Now, we will need to clear the cache for our d8dev site by going to Configuration, clicking on the Performance link, and then clicking on the Clear all caches button. Reload your Thai Basil Chicken recipe page, and you should see something similar to the following:

What just happened?

We added and configured a media-based field for our Recipe content type. We updated the d8dev module with custom CSS code to lay out the Recipe images in more of a grid format, and we also looked at how to attach a CSS file through a module.

Creating a custom image style

Before we configure a Colorbox feature, we are going to create a custom image style to use when we add it in the Colorbox content preview settings. Image styles for Drupal 8 are part of the core Image module. The core Image module provides three default image styles (thumbnail, medium, and large), as seen in the following Image styles configuration page:

Now, we are going to add a fourth, custom image style: one that will resize our images somewhere between the 100 x 75 thumbnail style and the 220 x 165 medium style. We will walk through the process of creating an image style through the Image styles administrative page, and also walk through the process of programmatically creating an image style.
Time for action – adding a custom image style through the image style administrative page First, we will use the Image style administrative page (admin/config/media/image-styles) to create a custom image style: Open the d8dev site in your favorite browser, click on the Configuration link in the Admin toolbar, and click on the Image styles link under the Media section. Once the Image styles administrative page has loaded, click on the Add style link. Next, enter small for the Image style name of your custom image style, and click on the Create new style button: Now, we will add the one and only effect for our custom image style by selecting Scale from the EFFECT options and then clicking on the Add button. On the Add Scale effect page, enter 160 for the width and 120 for the height. Leave the Allow Upscaling checkbox unchecked, and click on the Add effect button: Finally, just click on the Update style button on the Edit small style administrative page, and we are done. We now have a new custom small image style that we will be able to use to resize images for our site: What just happened? We learned how easy it is to add a custom image style with the administrative UI. Now, we are going to see how to add a custom image style by writing some code. The advantage of having code-based custom image styles is that it will allow us to utilize a source code repository, such as Git, to manage and deploy our custom image styles between different environments. For example, it would allow us to use Git to promote image styles from our development environment to a live production website. Otherwise, the manual configuration that we just did would have to be repeated for every environment. Time for action – creating a programmatic custom image style Now, we will see how we can add a custom image style with code: The first thing we need to do is delete the small image style that we just created. So, open your d8dev site in your favorite browser, click on the Configuration link in the Admin toolbar, and then click on the Image styles link under the Media section. Once the Image styles administrative page has loaded, click on the delete link for the small image style that we just added. Next, on the Optionally select a style before deleting small page, leave the default value for the Replacement style select list as No replacement, just delete, and click on the Delete button: In Drupal 8, image styles have been converted from an array to an object that extends ConfigEntity. All image styles provided by modules need to be defined as YAML configuration files in the config/install folder of each module. Suppose our module is located at modules/d8dev. Create a file called modules/d8dev/config/install/image.style.small.yml with the following content: uuid: b97a0bd7-4833-4d4a-ae05-5d4da0503041 langcode: en status: true dependencies: { } name: small label: small effects: c76016aa-3c8b-495a-9e31-4923f1e4be54: uuid: c76016aa-3c8b-495a-9e31-4923f1e4be54 id: image_scale weight: 1 data: width: 160 height: 120 upscale: false We need to use a UUID generator to assign unique IDs to image style effects. Do not copy/paste UUIDs from other pieces of code or from other image styles! The name of our custom style is small, is provided as the name and label as same. For each effect that we want to add to our image style, we will specify the effect we want to use as the name key, and then pass in values as the settings for the effect. 
In the case of the image_scale effect that we are using here, we pass in the width, height, and upscale settings. Finally, the value for the weight key allows us to specify the order in which the effects should be processed; although it is not very useful when there is only one effect, it becomes important when there are multiple effects. Now, we will need to uninstall and install our d8dev module by going to the Extend page. On the next screen, click on the Uninstall tab, check the d8dev checkbox, and click on the Uninstall button. Now, click on the List tab, check d8dev, and click on the Install button. Then, go back to the Image styles administrative page and you will see our programmatically created small image style.

What just happened?
We created a custom image style with some custom code. We then configured our Recipe content type to use our custom image style for images added to the Recipe images field.

Integrating the Colorbox and File entity modules
The File entity module provides interfaces for managing files. For images, we will be able to edit Title text, Alt text, and Filenames easily. However, the images are taking up quite a bit of room. Let's create a pop-up lightbox gallery and show the images in a popup. When someone clicks on an image, a lightbox will pop up and allow the user to cycle through larger versions of all associated images.

Time for action – installing the Colorbox module
Before we can display Recipe images in a Colorbox, we need to download and enable the module:

Open the Mac OS X Terminal or Windows Command Prompt, and change to the d8dev directory. Next, use Drush to download and enable the current dev release of the Colorbox module (http://drupal.org/project/colorbox):

$ drush dl colorbox-8.x-1.x-dev
Project colorbox (8.x-1.x-dev) downloaded to /var/www/html/d8dev/modules/colorbox. [success]
$ drush en colorbox
The following extensions will be enabled: colorbox
Do you really want to continue? (y/n): y
colorbox was enabled successfully. [ok]

The Colorbox module depends on the Colorbox jQuery plugin available at https://github.com/jackmoore/colorbox. The Colorbox module includes a Drush task that will download the required jQuery plugin into the /libraries directory:

$ drush colorbox-plugin
Colorbox plugin has been installed in libraries [success]

Next, we will look into the Colorbox display formatter. Click on the Structure link in the Admin toolbar, then click on the Content types link, and finally click on the manage display link for your Recipe content type under the Operations dropdown: Next, click on the FORMAT select list for the Recipe images field, and you will see an option for Colorbox. Select Colorbox, and you will see the settings change. Then, click on the settings icon: Now, you will see the settings for Colorbox. Select small as the Content image style and small as the Content image style for first image in the dropdowns, and use the default settings for the other options. Click on the Update button and then on the Save button at the bottom: Reload our Thai Basil Chicken recipe page, and you should see something similar to the following (with the new small image style): Now, click on any image and you will see the image loaded in the Colorbox popup: We have learned more about images with Colorbox, but Colorbox also supports videos. Another way to add some spice to our site is by adding videos, and there are several modules available that work with Colorbox for videos.
The Video Embed Field module creates a simple field type that allows you to embed videos from YouTube and Vimeo and show their thumbnail previews simply by entering the video's URL. So you can try this module to add some pizzazz to your site! What just happened? We installed the Colorbox module and enabled it for the Recipe images field on our custom Recipe content type. Now, we can easily add images to our d8dev content with the Colorbox pop-up feature. Working with Drupal issue queues Drupal has its own issue queue for working with a team of developers around the world. If you need help for a specific project, core, module, or a theme related, you should go to the issue queue, where the maintainers, users, and followers of the module/theme communicate. The issue page provides a filter option, where you can search for specific issues based on Project, Assigned, Submitted by, Followers, Status, Priority, Category, and so on. We can find issues at https://www.drupal.org/project/issues/colorbox. Here, replace colorbox with the specific module name. For more information, see https://www.drupal.org/issue-queue. In our case, we have one issue with the colorbox module. Captions are working for the Automatic and Content title properties, but are not working for the Alt text and Title text properties. To check this issue, go to Structure | Content types and click on Manage display. On the next screen, click on the settings icon for the Recipe images field. Now select the Caption option as Title text or Alt text and click on the Update button. Finally, click on the Save button. Reload the Thai Basil Chicken recipe page, and click on any image. Then it opens in popup, but we cannot see captions for this. Make sure you have the Title text and Alt text properties updated for Recipe images field for the Thai Basil Chicken recipe. Time for action – creating an issue for the Colorbox module Now, before we go and try to figure out how to fix this functionality for the Colorbox module, let's create an issue: On https://www.drupal.org/project/issues/colorbox, click on the Create a new issue link: On the next screen we will see a form. We will fill in all the required fields: Title, Category as Bug report, Version as 8.x-1.x-dev, Component as Code, and the Issue summary field. Once I submitted my form, an issue was created at https://www.drupal.org/node/2645160. You should see an issue on Drupal (https://www.drupal.org/) like this: Next, the Maintainers of the colorbox module will look into this issue and reply accordingly. Actually, @frjo replied saying "I have never used that module but if someone who does sends in a patch I will take a look at it." He is a contributor to this module, so we will wait for some time and will see if someone can fix this issue by giving a patch or replying with useful comments. In case someone gives the patch, then we have to apply that to the colorbox module. This information is available on Drupal at https://www.drupal.org/patch/apply . What just happened? We understood and created an issue in the Colorbox module's issue queue list. We also looked into what the required fields are and how to fill them to create an issue in the Drupal module queue list form. Summary In this article, we looked at a way to use our d8dev site with multimedia, creating image styles using some custom code, and learned some new ways of interacting with the Drupal developer community. We also worked with the Colorbox module to add images to our d8dev content with the Colorbox pop-up feature. 
Lastly, we looked into the custom module to work with custom CSS files. Resources for Article: Further resources on this subject: Installing Drupal 8 [article] Drupal 7 Social Networking: Managing Users and Profiles [article] Drupal 8 and Configuration Management [article]


Communication and Network Security

Packt
21 Jun 2016
7 min read
In this article by M. L. Srinivasan, the author of the book CISSP in 21 Days, Second Edition, the communication and network security domain deals with the security of voice and data communications through Local area, Wide area, and Remote access networking. Candidates are expected to have knowledge in the areas of secure communications; securing networks; threats, vulnerabilities, attacks, and countermeasures to communication networks; and protocols that are used in remote access. (For more resources related to this topic, see here.) Observe the following diagram. This represents the seven layers of the OSI model. This article covers protocols and security in the fourth layer, which is the Transport layer:

Transport layer protocols and security
The Transport layer does two things. One is to pack the data given out by applications into a format that is suitable for transport over the network, and the other is to unpack the data received from the network into a format suitable for applications. In this layer, some of the important protocols are Transmission Control Protocol (TCP), User Datagram Protocol (UDP), Stream Control Transmission Protocol (SCTP), Datagram Congestion Control Protocol (DCCP), and Fiber Channel Protocol (FCP). The process of packaging the data packets received from the applications is called encapsulation, and the output of such a process is called a datagram. Similarly, the process of unpacking the datagram received from the network is called decapsulation. When moving from the seventh layer down to the fourth, the fourth layer's header is placed on the data to form a datagram. When the datagram is encapsulated with the third layer's header, it becomes a packet; the encapsulated packet becomes a frame, and is put on the wire as bits. The following section describes some of the important protocols in this layer along with security concerns and countermeasures.

Transmission Control Protocol (TCP)
This is a core Internet protocol that provides reliable delivery mechanisms over the Internet. TCP is a connection-oriented protocol. A protocol that guarantees the delivery of datagrams (packets) to the destination application by way of a suitable mechanism (for example, the three-way handshake of SYN, SYN-ACK, and ACK in TCP) is called a connection-oriented protocol. The reliability of datagram delivery with such a protocol is high due to the acknowledgment by the receiver. This protocol has two main functions. The primary function of TCP is the transmission of datagrams between applications, and the secondary one is to provide the controls that are necessary for ensuring reliable transmissions. Applications where delivery needs to be assured, such as e-mail, the World Wide Web (WWW), file transfer, and so on, use TCP for transmission.

Threats, vulnerabilities, attacks, and countermeasures
One of the common threats to TCP is service disruption. A common vulnerability is half-open connections exhausting the server resources. Denial of Service attacks such as TCP SYN attacks, as well as connection hijacking such as IP spoofing attacks, are possible. A half-open connection is a vulnerability in the TCP implementation. TCP uses a three-way handshake to establish or terminate connections. Refer to the following diagram: In a three-way handshake, the client (workstation) first sends a request to the server (for example, www.SomeWebsite.com). This is called a SYN request.
The server acknowledges the request by sending a SYN-ACK, and in the process, it creates a buffer for this connection. The client does a final acknowledgement by ACK. TCP requires this setup, since the protocol needs to ensure the reliability of the packet delivery. If the client does not send the final ACK, then the connection is called half open. Since the server has created a buffer for that connection, a certain amount of memory or server resource is consumed. If thousands of such half-open connections are created maliciously, then the server resources may be completely consumed, resulting in Denial-of-Service to legitimate requests.

TCP SYN attacks technically establish thousands of half-open connections to consume the server resources. There are two actions that an attacker might take. One is that the attacker or malicious software will send thousands of SYN requests to the server and withhold the ACK. This is called SYN flooding. Depending on the capacity of the network bandwidth and the server resources, in a span of time, all the resources will be consumed, resulting in Denial-of-Service. If the source IP was blocked by some means, then the attacker or the malicious software would try to spoof the source IP addresses to continue the attack. This is called SYN spoofing.

SYN attacks such as SYN flooding and SYN spoofing can be controlled using SYN cookies with cryptographic hash functions. In this method, the server does not create the connection at the SYN-ACK stage. The server creates a cookie with the computed hash of the source IP address, source port, destination IP, destination port, and some random values based on the algorithm, and sends it as the SYN-ACK. When the server receives an ACK, it checks the details and creates the connection. A cookie is a piece of information, usually in the form of a text file, sent by the server to a client. Cookies are generally stored on the browser's disk or the client computer, and they are used for purposes such as authentication, session tracking, and management.
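To make the idea of SYN cookies a little more concrete, here is a minimal, purely illustrative sketch in Node.js. Real TCP stacks implement SYN cookies inside the operating system kernel with carefully chosen bit layouts, so the function name, the secret, and the addresses below are invented for illustration only; the point is simply that the cookie is a hash of the connection details plus a server-side secret, so no per-connection buffer needs to be allocated until the final ACK arrives:

// Conceptual sketch only: not how a production TCP/IP stack is implemented
var crypto = require('crypto');

var serverSecret = 'change-me'; // assumed server-side secret, rotated periodically

function makeSynCookie(srcIp, srcPort, dstIp, dstPort) {
    // Hash the connection details together with the secret and truncate the
    // result, mimicking the "computed hash" described above
    return crypto.createHash('sha256')
        .update([srcIp, srcPort, dstIp, dstPort, serverSecret].join('|'))
        .digest('hex')
        .slice(0, 8);
}

// The server would send this value in its SYN-ACK. When the client's ACK comes
// back, the server recomputes the hash, compares it, and only then allocates
// the connection state, so half-open connections no longer consume memory.
console.log(makeSynCookie('203.0.113.5', 51515, '198.51.100.7', 80));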
User Datagram Protocol (UDP)
UDP is a connectionless protocol and is similar to TCP. However, UDP does not provide a delivery guarantee for data packets. A protocol that does not guarantee the delivery of datagrams (packets) to the destination is called a connectionless protocol. In other words, the final acknowledgment is not mandatory in UDP. UDP uses one-way communication. The delivery speed of datagrams with UDP is high. UDP is predominantly used where the loss of intermittent packets is acceptable, such as video or audio streaming.

Threats, vulnerabilities, attacks, and countermeasures
Service disruptions are common threats, and validation weaknesses facilitate such threats. UDP flood attacks cause service disruptions, and controlling UDP packet size acts as a countermeasure to such attacks.

Internet Control Message Protocol (ICMP)
ICMP is used to discover service availability in network devices, servers, and so on. ICMP expects response messages from devices or systems to confirm the service availability.

Threats, vulnerabilities, attacks, and countermeasures
Service disruptions are common threats. Validation weaknesses facilitate such threats. ICMP flood attacks, such as the ping of death, cause service disruptions; and controlling ICMP packet size acts as a countermeasure to such attacks. Pinging is a process of sending the Internet Control Message Protocol (ICMP) ECHO_REQUEST message to servers or hosts to check whether they are up and running. In this process, the server or host on the network responds to a ping request, and such a response is called an echo. A ping of death refers to sending large numbers of ICMP packets to the server to crash the system.

Other protocols in the Transport layer
Stream Control Transmission Protocol (SCTP): This is a connection-oriented protocol similar to TCP, but it provides facilities such as multi-streaming and multi-homing for better performance and redundancy. It is used in UNIX-like operating systems.
Datagram Congestion Control Protocol (DCCP): As the name implies, this is a Transport layer protocol that is used for congestion control. Applications here include Internet telephony and video/audio streaming over the network.
Fiber Channel Protocol (FCP): This protocol is used in high-speed networking. One of the prominent applications here is the Storage Area Network (SAN). A Storage Area Network (SAN) is a network architecture used to attach remote storage devices, such as tape drives and disk arrays, to the local server. This facilitates using storage devices as if they were local devices.

Summary
This article covers protocols and security in the Transport layer, which is the fourth layer.

Resources for Article: Further resources on this subject: The GNS3 orchestra [article] CISSP: Vulnerability and Penetration Testing for Access Control [article] CISSP: Security Measures for Access Control [article]


Creating Multitenant Applications in Azure

Packt
21 Jun 2016
18 min read
This article, written by Roberto Freato and Marco Parenzan, is from the book Mastering Cloud Development using Microsoft Azure by Packt Publishing, and it teaches us how to create multitenant applications in Azure. This book guides you through the many efficient ways of mastering the cloud services and using Microsoft Azure and its services to its maximum capacity. (For more resources related to this topic, see here.) A tenant is a private space for a user or a group of users in an application. A typical way to identify a tenant is by its domain name. If multiple users share a domain name, we say that these users live inside the same tenant. If a group of users use a different reserved domain name, they live in a reserved tenant. From this, we can infer that different names are used to identify different tenants. Different domain names can imply different app instances, but we cannot say the same about deployed resources. Multitenancy is one of the funding principles of Cloud Computing. Developers need to reach economy of scale, which allows every cloud user to scale as needed without paying for overprovisioned resources or suffering for underprovisioned resources. To do this, cloud infrastructure needs to be oversized for a single user and sized for a pool of potential users that share the same group of resources during a certain period of time. Multitenancy is a pattern. Legacy on-premise applications usually tend to be a single-tenant app, shared between users because of the lack of specific DevOps tasks. Provisioning an app for every user can be a costly operation. Cloud environments invite reserving a single tenant for each user (or group of users) to enforce better security policies and to customize tenants for specific users because all DevOps tasks can be automated via management APIs. The cloud invites reserving resource instances for a tenant and deploying a group of tenants on the same resources. In general, this is a new way of handling app deployment. We will now take a look at how to develop an app in this way. Scenario CloudMakers.xyz, a cloud-based development company, decided to develop a personal accountant web application—MyAccountant. Professionals or small companies can register themselves on this app as a single customer and record all of their invoices on it. A single customer represents the tenant; different companies use different tenants. Every tenant needs its own private data to enforce data security, so we will reserve a dedicated database for a single tenant. Access to a single database is not an intensive task because invoice registration will generally occur once daily. Every tenant will have its own domain name to enforce company identity. A new tenant can be created from the company portal application, where new customers register themselves, specifying the tenant name. For sample purposes, without the objective of creating production-quality styling, we use the default ASP.NET MVC templates to style and build up apps and focus on tenant topics. Creating the tenant app A tenant app is an invoice recording application. To brand the tenant, we record tenant name in the app settings inside the web.config file: <add key="TenantName" value="{put_your_tenant_name}" /> To simplify this, we "brand" the application that stores the tenant name in the main layout file where the application name is displayed. The application content is represented by an Invoices page where we record data with a CRUD process. 
The entry for the Invoices page is in the Navigation bar: <ul class="nav navbar-nav"> <li>@Html.ActionLink("Home", "Index", "Home")</li> <li>@Html.ActionLink("Invoices", "Index", "Invoices")</li> <!-- other code omitted --> First, we need to define a model for the application in the models folder. As we need to store data in an Azure SQL database, we can use entity framework to create the model from an empty code. However, first we use the following code: public class InvoicesModel : DbContext { public InvoicesModel() : base("name=InvoicesModel") { } public virtual DbSet<Invoice> Invoices { get; set; } } As we can see, data will be accessed by a SQL database that is referenced by a connectionString in the web.config file: <add name="InvoicesModel" connectionString="data source=(LocalDb)MSSQLLocalDB;initial catalog=Tenant.Web.Models.InvoicesModel;integrated security=True;MultipleActiveResultSets=True; App=EntityFramework" providerName="System.Data.SqlClient" /></connectionStrings> This model class is just for demo purposes: public class Invoice { public int InvoiceId { get; set; } public int Number { get; set; } public DateTime Date { get; set; } public string Customer { get; set; } public decimal Amount { get; set; } public DateTime DueDate { get; set; } } After this, we try to compile the project to check whether we have not made any mistake. We can now scaffold this model into an MVC controller so that we can have a simple but working app skeleton. Creating the portal app We now need to create the portal app starting from the MVC default template. Its registration workflow is useful for the creation of our tenant registration. In particular, we utilize user registration as the tenant registration. The main information acquires the tenant name and triggers tenant deployment. We need to make two changes on the UI. First, in the RegisterViewModel defined under the Models folder, we add a TenantName property to the AccountViewModels.cs file: public class RegisterViewModel { [Required] [Display(Name = "Tenant Name")] public string TenantName { get; set; } [Required] [EmailAddress] [Display(Name = "Email")] public string Email { get; set; } // other code omitted } In the Register.cshtml view page under ViewsAccount folder, we add an input box: @using (Html.BeginForm("Register", "Account", FormMethod.Post, new { @class = "form-horizontal", role = "form" })) { @Html.AntiForgeryToken() <h4>Create a new account.</h4> <hr /> @Html.ValidationSummary("", new { @class = "text-danger" }) <div class="form-group"> @Html.LabelFor(m => m.TenantName, new { @class = "col-md-2 control-label" }) <div class="col-md-10"> @Html.TextBoxFor(m => m.TenantName, new { @class = "form-control" }) </div> </div> <div class="form-group"> @Html.LabelFor(m => m.Email, new { @class = "col-md-2 control-label" }) <div class="col-md-10"> @Html.TextBoxFor(m => m.Email, new { @class = "form- control" }) </div> </div> <!-- other code omitted --> } Portal application can be great to allow the tenant owner to manage its own tenant, configuring or handling subscription-related tasks to the supplier company. Deploying the portal application Before tenant deployment, we need to deploy the portal itself. MyAccountant is a complex solution made up of multiple Azure services, which needs to be deployed together. First, we need to create an Azure Resource Group to collect all the services: As we already discussed earlier, all data from different tenants, including the portal itself, need to be contained inside distinct Azure SQL databases. 
Every user will have their own DB as a personal service, which they don't use frequently. It can be a waste of money assigning a reserved quantity of Database Transaction Units (DTUs) to a single database. We can invest on a pool of DTUs that should be shared among all SQL database instances. We begin by creating an SQL Server service from the portal: We need to create a pool of DTUs, which are shared among databases, and configure the pricing tier, which defines the maximum resources allocation per DB: The first database that we need to manually deploy is the portal database, where users will register as tenants. From the MyAccountantPool blade, we can create a new database that will be immediately associated to the pool: From the database blade, we read the connection: We use this connection string to configure the portal app in web.config: <connectionStrings> <add name="DefaultConnection" connectionString="Server=tcp: {portal_db}.database.windows.net,1433;Data Source={portal_db}; .database.windows.net;Initial Catalog=Portal;Persist Security Info=False;User ID={your_username};Password={your_password}; Pooling=False;MultipleActiveResultSets=False;Encrypt=True; TrustServerCertificate=False;Connection Timeout=30;" providerName="System.Data.SqlClient" /> </connectionStrings> We need to create a shared resource for the Web. In this case, we need to create an App Service Plan where we'll host portal and tenants apps. The initial size is not a problem because we can decide to scale up or scale out the solution at any time (in this case, only when application is able to scale out—we don't handle this scenario here). Then, we need to create portal web app that will be associated with the service plan that we just created: The portal can be deployed from Visual Studio to the Azure subscription by right-clicking on the project root in Solution Explorer and selecting Microsoft Azure Web App from Publish. After deployment, the portal is up and running: Deploy the tenant app After tenant registration from the portal, we need to deploy tenant itself, which is made up of the following: The app itself that is considered as the artifact that has to be deployed A web app that runs the app, hosted on the already defined web app plan The Azure SQL database that contains data inside the elastic pool The connection string that connect database to the web app in the web.config file It's a complex activity because it involves many different resources and different kinds of tasks from deployment to configuration. For this purpose, we have the Azure Resource Group project in Visual Studio, where we can configure web app deployment and configuration via Azure Resource Manager templates. This project will be called Tenant.Deploy, and we choose a blank template to do this. In the azuredeploy.json file, we can type a template such as https://github.com/marcoparenzan/CreateMultitenantAppsInAzure/blob/master/Tenant.Deploy/Templates/azuredeploy.json. This template is quite complex. Remember that in the SQL connection string, the username and password should be provided inside the template. We need to reference the Tenant.Web project from the deployment project because we need to deploy tenant artifacts (the project bits). To support deployment, we need to create an Azure Storage Account back to the Azure portal: To understand how it works, we can manually run a deployment directly from Visual Studio by right-clicking on Deployment project from Solution Explorer and selecting Deploy. 
When we deploy a "sample" tenant, the first dialog will appear. You can connect to the Azure subscription, selecting an existing resource group or creating a new one and the template that describes the deployment composition. The template requires the following parameters from Edit Parameters window: The tenant name The artifact location and SAS token that are automatically added having selected the Azure Storage account from the previous dialog Now, via the included Deploy-AzureResourceGroup.ps1 PowerShell file, Azure resources are deployed. The artifact is copied with AzCopy.exe command to the Azure storage in the Tenant.Web container as a package.zip file and the resource manager starts allocating resources. We can see that tenant is deployed in the following screenshot: Automating the tenant deployment process Now, in order to complete our solution, we need to invoke this deployment process from the portal application during a registration process call in ASP.NET MVC controls. For the purpose of this article, we will just invoke the execution without defining a production-quality deployment process. We can use the following checklist before proceeding: We already have an Azure Resource Manager template that deploys the tenant app customized for the user Deployment is made with a PowerShell script in the Visual Studio deployment project A new registered user for our application does not have an Azure account; we, as service publisher, need to offer a dedicated Azure account with our credentials to deploy the new tenants Azure offers many different ways to interact with an Azure subscription: The classic portal (https://manage.windowsazure.com) The new portal (https://portal.azure.com) The resource portal (https://resources.azure.com) The Azure REST API (https://msdn.microsoft.com/en-us/library/azure/mt420159.aspx) The Azure .NET SDK (https://github.com/Azure/azure-sdk-for-net) and other platforms The Azure CLI open source CLI (https://github.com/Azure/azure-xplat-cli) PowerShell (https://github.com/Azure/azure-powershell) For our needs, this means integrating in our application. We can make these considerations: We need to reuse the same ARM template that we defined We can reuse PowerShell experience, but we can also use our experience as .NET, REST, or other platform developers Authentication is the real discriminator in our solution: the user is not an Azure subscription user and we don't want to make a constraint on this Interacting with Azure REST API, which is the API on which every other solution depends, requires that all invocations need to be authenticated to the Azure Active Directory of the subscription tenant. We already mentioned that the user is not a subscription-authenticated user. Therefore, we need an unattended authentication to our Azure API subscription using a dedicated user for this purpose, encapsulated into a component that is executed by the ASP.NET MVC application in a secure manner to make the tenant deployment. The only environment that offers an out-of-the box solution for our needs (so that we need to write less code) is the Azure Automation Service. Before proceeding, we create a dedicated user for this purpose. Therefore, for security reasons, we can disable a specific user at any time. You should take note of two things: Never use the credentials that you used to register Azure subscription in a production environment! For automation implementation, you need a Azure AD tenant user, so you cannot use Microsoft accounts (Live or Hotmail). 
To create the user, we need to go to the classic portal, as Azure Active Directory has no equivalent management UI in the new portal. We need to select the tenant directory, that is, the one in the new portal that is visible in the upper right corner. From the classic portal, go to to Azure Active Directory and select the tenant. Click on Add User and type in a new username: Then, go to Administrator Management in the Setting tab of the portal because we need to define the user as a co-administrator in the subscription that we need to use for deployment. Now, with the temporary password, we need to log in manually to https://portal.azure.com/ (open the browser in private mode) with these credentials because we need to change the password, as it is generated as "expired". We are now ready to proceed. Back in the new portal, we select a new Azure Automation account: The first thing that we need to do inside the account is create a credential asset to store the newly-created AAD credentials and use the inside PowerShell scripts to log on in Azure: We can now create a runbook, which is an automation task that can be expressed in different ways: Graphical PowerShell We choose the second one: As we can edit it directly from portal, we can write a PowerShell script for our purposes. This is an adaptation from the one that we used in a standard way in the deployment project inside Visual Studio. The difference is that it is runable inside a runbook and Azure, and it uses already deployed artifacts that are already in the Azure Storage account that we created before. Before proceeding, we need of two IDs from our subscription: The subscription ID The tenant ID These two parameters can be discovered with PowerShell because we can perform Login-AzureRmAccount. Run it through the command line and copy them from the output: The following code is not production quality (needs some optimization) but for demo purposes: param ( $WebhookData, $TenantName ) # If runbook was called from Webhook, WebhookData will not be null. 
if ($WebhookData -ne $null) { $Body = ConvertFrom-Json -InputObject $WebhookData.RequestBody $TenantName = $Body.TenantName } # Authenticate to Azure resources retrieving the credential asset $Credentials = Get-AutomationPSCredential -Name "myaccountant" $subscriptionId = '{your subscriptionId}' $tenantId = '{your tenantId}' Login-AzureRmAccount -Credential $Credentials -SubscriptionId $subscriptionId -TenantId $tenantId $artifactsLocation = 'https://myaccountant.blob.core.windows.net/ myaccountant-stageartifacts' $ResourceGroupName = 'MyAccountant' # generate a temporary StorageSasToken (in a SecureString form) to give ARM template the access to the templatea artifacts$StorageAccountName = 'myaccountant' $StorageContainer = 'myaccountant-stageartifacts' $StorageAccountKey = (Get-AzureRmStorageAccountKey - ResourceGroupName $ResourceGroupName -Name $StorageAccountName).Key1 $StorageAccountContext = (Get-AzureRmStorageAccount - ResourceGroupName $ResourceGroupName -Name $StorageAccountName).Context $StorageSasToken = New-AzureStorageContainerSASToken -Container $StorageContainer -Context $StorageAccountContext -Permission r -ExpiryTime (Get-Date).AddHours(4) $SecureStorageSasToken = ConvertTo-SecureString $StorageSasToken -AsPlainText -Force #prepare parameters for the template $ParameterObject = New-Object -TypeName Hashtable $ParameterObject['TenantName'] = $TenantName $ParameterObject['_artifactsLocation'] = $artifactsLocation $ParameterObject['_artifactsLocationSasToken'] = $SecureStorageSasToken $deploymentName = 'MyAccountant' + '-' + $TenantName + '-'+ ((Get-Date).ToUniversalTime()).ToString('MMdd-HHmm') $templateLocation = $artifactsLocation + '/Tenant.Deploy/Templates/azuredeploy.json' + $StorageSasToken # execute New-AzureRmResourceGroupDeployment -Name $deploymentName ` -ResourceGroupName $ResourceGroupName ` -TemplateFile $templateLocation ` @ParameterObject ` -Force -Verbose The script is executable in the Test pane, but for production purposes, it needs to be deployed with the Publish button. Now, we need to execute this runbook from outside ASP.NET MVC portal that we already created. We can use Webhooks for this purpose. Webhooks are user-defined HTTP callbacks that are usually triggered by some event. In our case, this is new tenant registration. As they use HTTP, they can be integrated into web services without adding new infrastructure. Runbooks can directly be exposed as a Webhooks that provides HTTP endpoint natively without the need to provide one by ourself. We need to remember some things: Webhooks are public with a shared secret in the URL, so it is "secure" if we don't share it As a shared secret, it expires, so we need to handle Webhook update in the service lifecycle As a shared secret if more users are needed, more Webhooks are needed, as the URL is the only way to recognize who invoked it (again, don't share Webhooks) Copy the URL at this stage as it is not possible to recover it but it needs to be deleted and generate a new one Write it directly in portal web.config app settings: <add key="DeplyNewTenantWebHook" value="https://s2events.azure- automation.net/webhooks?token={your_token}"/> We can set some default parameters if needed, then we can create it. 
To invoke the Webhook, we use System.Net.HttpClient to create a POST request, placing a JSON object containing TenantName in the body: var requestBody = new { TenantName = model.TenantName }; var httpClient = new HttpClient(); var responseMessage = await httpClient.PostAsync( ConfigurationManager.AppSettings ["DeplyNewTenantWebHook"], new StringContent(JsonConvert.SerializeObject (requestBody)) ); This code is used to customize the registration process in AccountController: public async Task<ActionResult> Register(RegisterViewModel model) { if (ModelState.IsValid) { var user = new ApplicationUser { UserName = model.Email, Email = model.Email }; var result = await UserManager.CreateAsync(user, model.Password); if (result.Succeeded) { await SignInManager.SignInAsync(user, isPersistent:false, rememberBrowser:false); // handle webhook invocation here return RedirectToAction("Index", "Home"); } AddErrors(result); } The responseMessage is again a JSON object that contains JobId that we can use to programmatically access the executed job. Conclusion There are a lot of things that can be done with the set of topics that we covered in this article. These are a few of them: We can write better .NET code for multitenant apps We can authenticate users on with the Azure Active Directory service We can leverage deployment tasks with Azure Service Bus messaging We can create more interaction and feedback during tenant deployment We can learn how to customize ARM templates to deploy other Azure Storage services, such as DocumentDB, Azure Storage, and Azure Search We can handle more PowerShell for the Azure Management tasks Summary Azure can change the way we write our solutions, giving us a set of new patterns and powerful services to develop with. In particular, we learned how to think about multitenant apps to ensure confidentiality to the users. We looked at deploying ASP.NET web apps in app services and providing computing resources with App Services Plans. We looked at how to deploy SQL in Azure SQL databases and computing resources with elastic pool. We declared a deployment script with Azure Resource Manager, Azure Resource Template with Visual Studio cloud deployment projects, and automated ARM PowerShell script execution with Azure Automation and runbooks. The content we looked at in the earlier section will be content for future articles. Code can be found on GitHub at https://github.com/marcoparenzan/CreateMultitenantAppsInAzure. Have fun! Resources for Article: Further resources on this subject: Introduction to Microsoft Azure Cloud Services [article] Microsoft Azure – Developing Web API for Mobile Apps [article] Security in Microsoft Azure [article]


Five common questions for .NET/Java developers learning JavaScript and Node.js

Packt
20 Jun 2016
19 min read
In this article by Harry Cummings, author of the book Learning Node.js for .NET Developers For those with web development experience in .NET or Java, perhaps who've written some browser-based JavaScript in the past, it might not be obvious why anyone would want to take JavaScript beyond the browser and treat it as a general-purpose programming language. However, this is exactly what Node.js does. What's more, Node.js has been around for long enough now to have matured as a platform, and has sustained its impressive growth in popularity well beyond any period that could be attributed to initial hype over a new technology. In this introductory article, we'll look at why Node.js is a compelling technology worth learning more about, and address some of the common barriers and sources of confusion that developers encounter when learning Node.js and JavaScript. (For more resources related to this topic, see here.) Why use Node.js? The execution model of Node.js follows that of JavaScript in the browser. This might not be an obvious choice for server-side development. In fact, these two use cases do have something important in common. User interface code is naturally event-driven (for example, binding event handlers to button clicks). Node.js makes this a virtue by applying an event-driven approach to server-side programming. Stated formally, Node.js has a single-threaded, non-blocking, event-driven execution model. We'll define each of these terms. Non-blocking Put simply, Node.js recognizes that many programmes spend most of their time waiting for other things to happen. For example, slow I/O operations such as disk access and network requests. Node.js addresses this by making these operations non-blocking. This means that program execution can continue while they happen. This non-blocking approach is also called asynchronous programming. Of course, other platforms support this too (for example, C#'s async/await keywords and the Task Parallel Library). However, it is baked in to Node.js in a way that makes it simple and natural to use. Asynchronous API methods are all called in the same way: They all take a callback function to be invoked ("called back") when the execution completes. This function is invoked with an optional error parameter and the result of the operation. The consistency of calling non-blocking (asynchronous) API methods in Node.js carries through to its third-party libraries. This consistency makes it easy to build applications that are asynchronous throughout. Other JavaScript libraries, such as bluebird (http://bluebirdjs.com/docs/getting-started.html), allow callback-based APIs to be adapted to other asynchronous patterns. As an alternative to callbacks, you may choose to use Promises (similar to Tasks in .NET or Futures in Java) or coroutines (similar to async methods in C#) within your own codebase. This allows you to streamline your code while retaining the benefits of consistent asynchronous APIs in Node.js and its third-party libraries. Event-driven The event-driven nature of Node.js describes how operations are scheduled. In typical procedural programming environments, a program has some entry point that executes a set of commands until completion or enters a loop to perform work on each iteration. Node.js has a built-in event loop, which isn't directly exposed to the developer. It is the job of the event loop to decide which piece of code to execute next. Typically, this will be some callback function that is ready to run in response to some other event. 
For example, a filesystem operation may have completed, a timeout may have expired, or a new network request may have arrived. This built-in event loop simplifies asynchronous programming by providing a consistent approach and avoiding the need for applications to manage their own scheduling.

Single-threaded
The single-threaded nature of Node.js simply means that there is only one thread of execution in each process. Also, each piece of code is guaranteed to run to completion without being interrupted by other operations. This greatly simplifies development and makes programmes easier to reason about. It removes the possibility for a range of concurrency issues. For example, it is not necessary to synchronize/lock access to shared in-process state as it is in Java or .NET. A process can't deadlock itself or create race conditions within its own code. Single-threaded programming is only feasible if the thread never gets blocked waiting for long-running work to complete. Thus, this simplified programming model is made possible by the non-blocking nature of Node.js.

Writing web applications
The flagship use case for Node.js is in building websites and web APIs. These are inherently event-driven as most or all processing takes place in response to HTTP requests. Also, many websites do little computational heavy-lifting of their own. They tend to perform a lot of I/O operations, for example:

Streaming requests from the client
Talking to a database locally or over the network
Pulling in data from remote APIs over the network
Reading files from disk to send back to the client

These factors make I/O operations a likely bottleneck for web applications. The non-blocking programming model of Node.js allows web applications to make the most of a single thread. As soon as any of these I/O operations starts, the thread is immediately free to pick up and start processing another request. Processing of each request continues via asynchronous callbacks when I/O operations complete. The processing thread is only kicking off and linking together these operations, never waiting for them to complete. This allows Node.js to handle a much higher rate of requests per thread than other runtime environments.

How does Node.js scale?
So, Node.js can handle many requests per thread, but what happens when we reach the limit of what one thread can handle? The answer is, of course, to use more threads! You can achieve this by starting multiple Node.js processes, typically one for each web server CPU core. Note that this is still quite different to most Java or .NET web applications. These typically use a pool of threads much larger than the number of cores, because threads are expected to spend much of their time being blocked. The built-in Node.js cluster module makes it straightforward to spawn multiple Node.js processes. Tools such as PM2 (http://pm2.keymetrics.io/) and libraries such as throng (https://github.com/hunterloftis/throng) make it even easier to do so. This approach gives us the best of all worlds:

Using multiple threads makes the most of our available CPU power
By having a single thread per core, we also save overheads from the operating system context-switching between threads
Since the processes are independent and don't share state directly, we retain the benefits of the single-threaded programming model discussed above
By using long-running application processes (as with .NET or Java), we avoid the overhead of a process-per-request (as in PHP)
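To make that scaling approach concrete, here is a minimal sketch using the built-in cluster module mentioned above. Error handling, worker restarts, and graceful shutdown are deliberately left out, and the port number is an arbitrary choice for illustration:

var cluster = require('cluster');
var http = require('http');
var os = require('os');

if (cluster.isMaster) {
    // Fork one worker process per CPU core, as described above
    os.cpus().forEach(function() {
        cluster.fork();
    });
} else {
    // Each worker is an ordinary single-threaded Node.js process
    http.createServer(function(req, res) {
        res.end('Handled by worker ' + process.pid + '\n');
    }).listen(3000);
}

Tools such as PM2 and throng essentially wrap this same pattern, adding conveniences such as automatic restarts when a worker dies.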
Do I really have to use JavaScript?
A lot of web developers new to Node.js will already have some experience of client-side JavaScript. This experience may not have been positive and might put you off using JavaScript elsewhere. You do not have to use JavaScript to work with Node.js. TypeScript (http://www.typescriptlang.org/) and other compile-to-JavaScript languages exist as alternatives. However, I do recommend learning Node.js with JavaScript first. It will give you a clearer understanding of Node.js and simplify your tool chain. Once you have a project or two under your belt, you'll be better placed to understand the pros and cons of other languages. In the meantime, you might be pleasantly surprised by the JavaScript development experience in Node.js.

There are three broad categories of prior JavaScript development experience that can lead to people having a negative impression of it. These are as follows:

Experience from the late 90s and early 00s, prior to MV* frameworks like Angular/Knockout/Backbone/Ember, maybe even prior to jQuery. This is the pioneer phase of client-side web development.
More recent experience within the much more mature JavaScript ecosystem, perhaps as a full-stack developer writing server-side and client-side code. The complexity of some frameworks (such as the MV* frameworks listed earlier), or the sheer amount of choice in general, can be overwhelming.
Limited experience with JavaScript itself, but exposure to some of its most unusual characteristics. This may lead to a jarring sensation as a result of encountering the language in surprising or unintuitive ways.

We'll address groups of people affected by each type of experience in turn. But note that individuals might identify with more than one of these groups. I'm happy to admit that I've been a member of all three in the past.

The web pioneers
These developers have been burned by working with client-side JavaScript in the past. The browser is sometimes described as a hostile environment for code to execute in. A single execution context shared by all code allows for some particularly nasty gotchas. For example, third-party code on the same page can create and modify global objects. Node.js solves some of these issues on a fundamental level, and mitigates others where this isn't possible. It's JavaScript, so it's still the case that everything is mutable. But the Node.js module system narrows the global scope, so libraries are less likely to step on each other's toes. The conventions that Node.js establishes also make third-party libraries much more consistent. This makes the environment less hostile and more predictable.

The web pioneers will also have had to cope with the APIs available to JavaScript in the browser. Although these have improved over time as browsers and standards have matured, the earlier days of web development were more like the Wild West. Quirks and inconsistencies in fundamental APIs caused a lot of hard work and frustration. The rise of jQuery is a testament to the difficulty of working with the Document Object Model of old. The continued popularity of jQuery indicates that people still prefer to avoid working with these APIs directly. Node.js addresses these issues quite thoroughly: First of all, by taking JavaScript out of the browser, the DOM and other APIs simply go away as they are no longer relevant. The new APIs that Node.js introduces are small, focused, and consistent. You no longer need to contend with inconsistencies between browsers. Everything you write will execute in the same JavaScript engine (V8).
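To see what narrowing the global scope means in practice, here is a tiny sketch; the file names and the prefix value are invented purely for illustration. A top-level variable in a Node.js module stays private to that module instead of becoming a page-wide global, as it would in a classic browser script tag:

// logger.js: 'prefix' is local to this module, not a global
var prefix = '[app] ';
module.exports = {
    info: function(message) {
        console.log(prefix + message);
    }
};

// main.js
var logger = require('./logger.js');
logger.info('started');        // Prints "[app] started"
console.log(typeof prefix);    // Prints "undefined" (nothing has leaked globally)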
The overwhelmed full-stack developers
Many of the frontend JavaScript frameworks provide a lot of power, but come with a great deal of complexity. For example, AngularJS has a steep learning curve, is quite opinionated about application structure, and has quite a few gotchas or things you just need to know. JavaScript itself is actually a language with a very small surface area. This provides a blank canvas for Node.js to provide a small number of consistent APIs (as described in the previous section). Although there's still plenty to learn in total, you can focus on just the things you need without getting tripped up by areas you're not yet familiar with.

It's still true that there's a lot of choice and that this can be bewildering. For example, there are many competing test frameworks for JavaScript. The trend towards smaller, more composable packages in the Node.js ecosystem, while generally a good thing, can mean more research, more decisions, and fewer batteries-included frameworks that do everything out of the box. On balance though, this makes it easier to move at your own pace and understand everything that you're pulling into your application.

The JavaScript dabblers
It's easy to have a poor impression of JavaScript if you've only worked with it occasionally and never as the primary (or even secondary) language on a project. JavaScript doesn't do itself any favors here, with a few glaring gotchas that most people will encounter. For example, the fundamentally broken == equality operator and other symptoms of type coercion. Although these make a poor first impression, they aren't really indicative of the experience of working with JavaScript more regularly. As mentioned in the previous section, JavaScript itself is actually a very small language. Its simplicity limits the number of gotchas there can be. While there are a few things you "just need to know", it's a short list. This compares well against languages that offer a constant stream of nasty surprises (for example, PHP's notoriously inconsistent built-in functions). What's more, successive ECMAScript standards have done a lot to clean up the JavaScript language. With Node.js, you get to take advantage of this, as all your code will run on the V8 engine, which implements the latest ES2015 standard.

The other big reason that JavaScript can be jarring is more a matter of context than an inherent flaw. It looks superficially similar to the other languages with a C-like syntax, like Java and C#. The similarity to Java was intentional when JavaScript was created, but it's unfortunate. JavaScript's programming model is quite different to other object-oriented languages like Java or C#. This can be confusing or frustrating when its syntax suggests that it may work in roughly the same way. This is especially true of object-oriented programming in JavaScript, as we'll discuss shortly. Once you've understood the fundamentals of JavaScript though, it's very easy to work productively with it.

Working with JavaScript
I'm not going to argue that JavaScript is the perfect language. But I do think many of the factors that lead to people having a bad impression of JavaScript are not down to the language itself. Importantly, many factors simply don't apply when you take JavaScript out of the browser environment. What's more, JavaScript has some really great extrinsic properties. These are things that aren't visible in the code, but have an effect on what it's like to work with the language.
For example, JavaScript's interpreted nature makes it easy to set up automated tests to run continuously and provide near-instant feedback on your code changes.

How does inheritance work in JavaScript?
When introducing object-oriented programming, we usually talk about classes and inheritance. Java, C#, and numerous other languages take a very similar approach to these concepts. JavaScript is quite unusual in that it supports object-oriented programming without classes. It does this by applying the concept of inheritance directly to objects. Anything that is not one of JavaScript's built-in primitives (strings, numbers, null, and so on) is an object. Functions are just a special type of object that can be invoked with arguments. Arrays are a special type of object with list-like behavior. All objects (including these two special types) can have properties, which are just names with a value. You can think of JavaScript objects as a dictionary with string keys and object values.

Programming without classes
Let's say you have a chart with a very large number of data points. These points may be represented by objects that have some common behavior. In C# or Java, you might create a Point class. In JavaScript, you could implement points like this:

function createPoint(x, y) {
    return {
        x: x,
        y: y,
        isAboveDiagonal: function() {
            return this.y > this.x;
        }
    };
}

var myPoint = createPoint(1, 2);
console.log(myPoint.isAboveDiagonal()); // Prints "true"

The createPoint function returns a new object each time it is called (the object is defined using JavaScript's object-literal notation, which is the basis for JSON). One problem with this approach is that the function assigned to the isAboveDiagonal property is redefined for each point on the graph, thus taking up more space in memory. You can address this using prototypal inheritance. Although JavaScript doesn't have classes, objects can inherit from other objects. Each object has a prototype. If the interpreter attempts to access a property on an object and that property doesn't exist, it will look for a property with the same name on the object's prototype instead. If the property doesn't exist there, it will check the prototype's prototype, and so on. The prototype chain will end with the built-in Object.prototype. You can implement point objects using a prototype as follows:

var pointPrototype = {
    isAboveDiagonal: function() {
        return this.y > this.x;
    }
};

function createPoint(x, y) {
    var newPoint = Object.create(pointPrototype);
    newPoint.x = x;
    newPoint.y = y;
    return newPoint;
}

var myPoint = createPoint(1, 2);
console.log(myPoint.isAboveDiagonal()); // Prints "true"

The Object.create method creates a new object with a specified prototype. The isAboveDiagonal method now only exists once in memory, on the pointPrototype object. When the code tries to call isAboveDiagonal on an individual point object, it is not present, but it is found on the prototype instead. Note that the preceding example tells us something important about the behavior of the this keyword in JavaScript. It actually refers to the object that the current function was called on, rather than the object it was defined on.
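As a quick illustration of that last point about this, consider calling the same isAboveDiagonal function against different objects. The otherPoint object below is invented purely for this example; call and bind are standard methods available on every JavaScript function:

// 'this' depends on how a function is called, not where it was defined
var otherPoint = { x: 5, y: 1 };
console.log(pointPrototype.isAboveDiagonal.call(otherPoint)); // Prints "false"

var detached = myPoint.isAboveDiagonal;
// Called on its own, detached() would no longer have myPoint as 'this',
// but bind can fix 'this' explicitly:
var boundCheck = detached.bind(myPoint);
console.log(boundCheck()); // Prints "true"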
Creating objects with the 'new' keyword You can rewrite the previous code example in a more compact form using the new operator: function Point(x, y) {     this.x = x;     this.y = y; }   Point.prototype.isAboveDiagonal = function() {     return this.y > this.x; }   var myPoint = new Point(1, 2); By convention, functions have a property named prototype, which defaults to an empty object. Using the new operator with the Point function creates an object that inherits from Point.prototype and applies the Point function to the newly created object. Programming with classes Although JavaScript doesn't fundamentally have classes, ES2015 introduces a new class keyword. This makes it possible to implement shared behavior and inheritance in a way that may be more familiar compared to other object-oriented languages. The equivalent of the previous code example would look like the following: class Point {     constructor(x, y) {         this.x = x;         this.y = y;     }         isAboveDiagonal() {         return this.y > this.x;     } }   var myPoint = new Point(1, 2); Note that this really is equivalent to the previous example. The class keyword is just syntactic sugar for setting up the prototype-based inheritance already discussed. Once you know how to define objects and classes, you can start to structure the rest of your application. How do I structure Node.js applications? In C# and Java, the static structure of an application is defined by namespaces or packages (respectively) and static types. An application's run-time structure (that is, the set of objects created in memory) is typically bootstrapped using a dependency injection (DI) container. Examples of DI containers include NInject, Autofac and Unity in .NET, or Spring, Guice and Dagger in Java. These frameworks provide features like declarative configuration and autowiring of dependencies. Since JavaScript is a dynamic, interpreted language, it has no inherent static application structure. Indeed, in the browser, all the scripts loaded into a page run one after the other in the same global context. The Node.js module system allows you to structure your application into files and directories and provides a mechanism for importing functionality from one file into another. There are DI containers available for JavaScript, but they are less commonly used. It is more common to pass around dependencies explicitly. The Node.js module system and JavaScript's dynamic typing makes this approach more natural. You don't need to add a lot of fields and constructors/properties to set up dependencies. You can just wrap modules in an initialization function that takes dependencies as parameters. The following very simple example illustrates the Node.js module system, and shows how to inject dependencies via a factory function: We add the following code under /src/greeter.js: module.exports = function(writer) {     return {         greet: function() { writer.write('Hello World!'); }     } }; We add the following code under /src/main.js: var consoleWriter = {     write: function(string) { console.log(string); } }; var greeter = require('./greeter.js')(consoleWriter); greeter.greet(); In the Node.js module system, each file establishes a new module with its own global scope. Within this scope, Node.js provides the module object for the current module to export its functionality, and the require function for importing other modules. 
If you run the previous example (using node main.js), the Node.js runtime will load the greeter module as a result of the main module's call to the require function. The greeter module assigns a value to the exports property of the module object. This becomes the return value of the require call back in the main module. In this case, the greeter module exports a single object, which is a factory function that takes a dependency. Summary In this article, we have: Understood the Node.js programming model and its use in web applications Described how Node.js web applications can scale Discussed the suitability of JavaScript as a programming language Illustrated how object-oriented programming works in JavaScript Seen how dependency injection works with the Node.js module system Hopefully this article has given you some insight into why Node.js is a compelling technology, and made you better prepared to learn more about writing server-side applications with JavaScript and Node.js. Resources for Article: Further resources on this subject: Web Components [article] Implementing a Log-in screen using Ext JS [article] Arrays [article]
Getting Started with TensorFlow: an API Primer

Sam Abrahams
19 Jun 2016
8 min read
TensorFlow has picked up a lot of steam over the past couple of months, and there's been more and more interest in learning how to use the library. I've seen tons of tutorials out there that just slap together TensorFlow code, roughly describe what some of the lines do, and call it a day. Conversely, I've seen really dense tutorials that mix together universal machine learning concepts and TensorFlow's API. There is value in both of these sorts of examples, but I find them either a little too sparse or too confusing respectively. In this post, I plan to focus solely on information related to the TensorFlow API, and not touch on general machine learning concepts (aside from describing computational graphs). Additionally, I will link directly to relevant portions of the TensorFlow API for further reading. While this post isn't going to be a proper tutorial, my hope is that focusing on the core components and workflows of the TensorFlow API will make working with other resources more accessible and comprehensible. As a final note, I'll be referring to the Python API and not the C++ API in this post. Definitions Let's start off with a glossary of key words you're going to see when using TensorFlow. Tensor: An n-dimensional matrix. For most practical purposes, you can think of them the same way you would a two-dimensional matrix for matrix algebra. In TensorFlow, the return value of any mathematical operation is a tensor. See here for more about TensorFlow Tensor objects. Graph: The computational graph that is defined by the user. It's constructed of nodes and edges, representing computations and connections between those computations respectively. For a quick primer on computation graphs and how they work in backpropagation, check out Chris Olah's post here. A TensorFlow user can define more than one Graph object and run them separately. Additionally, it is possible to define a large graph and run only smaller portions of it. See here for more information about TensorFlow Graphs. Op, Operation (Ops, Operations): Any sort of computation on tensors. Operations (or Ops) can take in zero or more TensorFlow Tensor objects, and output zero or more Tensor objects as a result of the computation. Ops are used all throughout TensorFlow, from doing simple addition to matrix multiplication to initializing TensorFlow variables. Operations run only when they are passed to the Session object, which I'll discuss below. For the most part, nodes and operations are interchangeable concepts. In this guide, I'll try to use the term Operation or Op when referring to TensorFlow-specific operations and node when referring to general computation graph terminology. Here's the API reference for the Operation class. Node: A computation in the graph that takes as input zero or more tensors and outputs zero or more tensors. A node does not have to interact with any other nodes, and thus does not have to have any edges connected to it. Visually, these are usually depicted as ellipses or boxes. Edge: The directed connection between two nodes. In TensorFlow, each edge can be seen as one or more tensors, and usually represents the output of one node becoming the input of the next node. Visually, these are usually depicted as lines or arrows. Device: A CPU or GPU. In TensorFlow, computations can occur across many different CPUs and GPUs, and it must keep track of these devices in order to coordinate work properly. 
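To see how these terms fit together before moving on to the workflow, here is a minimal sketch (the constant values are arbitrary, and it uses the same Python API assumed throughout this post) that builds a tiny graph of three Ops and then executes it:

import tensorflow as tf

# Two constant Ops; each outputs a single tensor and takes no inputs.
a = tf.constant(3.0, name='a')
b = tf.constant(4.0, name='b')

# An addition Op; the edges into it are the tensors produced by 'a' and 'b'.
total = tf.add(a, b, name='total')

# So far we have only described a graph - nothing has been computed yet.
# The Session is what actually executes the nodes we ask for.
sess = tf.Session()
print(sess.run(total))  # Prints 7.0
sess.close()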
The Typical TensorFlow Coding Workflow Writing a working TensorFlow model boils down to two steps: Build the Graph using a series of Operations, placeholders, and Variables. Run the Graph with training data repeatedly using the Session (you'll want to test the model while training to make sure it's learning properly). Sounds simple enough, and once you get a hang of it, it really is! We talked about Ops in the section above, but now I want to put special emphasis on placeholders, Variables, and the Session. They are fairly easy to grasp, but getting these core fundamentals solidified will give context to learning the rest of the TensorFlow API. Placeholders A Placeholder is a node in the graph that must be fed data via the feed_dict parameter in Session.run (see below). In general, these are used to specify input data and label data. Basically, you use placeholders to tell TensorFlow, "Hey TF, the data here is going to change each time you run the graph, but it will always be a tensor of size [N] and data-type [D]. Use that information to make sure that my matrix/tensor calculations are set up properly." TensorFlow needs to have that information in order to compile the program, as it has to guarantee that you don't accidentally try to multiply a 5x5 matrix with an 8x8 matrix (amongst other things). Placeholders are easy to define. Just make a variable that is assigned to the result of tensorflow.placeholder(): import tensorflow as tf # Create a Placeholder of size 100x400 that will contain 32-bit floating point numbers my_placeholder = tf.placeholder(tf.float32, shape=(100, 400)) Read more about Placeholder objects here. Note: We are required to feed data to the placeholder when we run our graph. We'll cover this in the Session section below. Variables Variables are objects that contain tensor information but persist across multiple calls to Session.run(). That is, they contain information that can be altered during the run of a graph, and then that altered state can be accessed the next time the graph is run. Variables are used to hold the weights and biases of a machine learning model while it trains, and their final values are what define the trained model. Defining and using Variables is mostly straightforward. Define a Variable with tensorflow.Variable() and update its information with the assign() method: import tensorflow as tf # Create a variable with the value 0 and the name of 'my_variable' my_var = tf.Variable(0, name='my_variable') # Increment the variable by one my_var.assign(my_var + 1) One catch with Variable objects is that you can't run Ops with them until you initialize them within the Session object. This is usually done with the Operation returned from tf.initialize_all_variables(), as I'll describe in the next section. Variable API reference The official how-to for Variable objects The Session Finally, let's talk about running the Session. The TensorFlow Session object is in charge of keeping track of all Variables, coordinating computation across devices, and generally doing anything that involves running the graph. You generally start a Session by calling tensorflow.Session(), and either directly assign the value of that statement to a handle or use a with ... as statement. The most important method in the Session object is run(), which takes in as input fetches, a list of Operations and Tensors that the user wishes to calculate; and feed_dict, which is an optional dictionary mapping Tensors (often Placeholders) to values that should override that Tensor. 
This is how you specify which values you want returned from your computation as well as the input values for training. Here is a toy example that uses a placeholder, a Variable, and the Session to showcase their basic use: import tensorflow as tf # Create a placeholder for inputting floating point data later a = tf.placeholder(tf.float32) # Make a base Variable object with the starting value of 0 start = tf.Variable(0.0) # Create a node that is the value of incrementing the 'start' Variable by the value of 'a' y = start.assign(start + a) # Open up a TensorFlow Session and assign it to the handle 'sess' sess = tf.Session() # Important: initialize the Variable, or else we won't be able to run our computation init = tf.initialize_all_variables() # 'init' is an Op: must be run by sess sess.run(init) # Now the Variable is initialized! # Get the value of 'y', feeding in different values for 'a', and print the result # Because we are using a Variable, the value should change each time print(sess.run(y, feed_dict={a:1})) # Prints 1.0 print(sess.run(y, feed_dict={a:0.5})) # Prints 1.5 print(sess.run(y, feed_dict={a:2.2})) # Prints 3.7 # Close the Session sess.close() Check out the documentation for TensorFlow's Session object here. Finishing Up Alright! This primer should give you a head start on understanding more of the resources out there for TensorFlow. The less you have to think about how TensorFlow works, the more time you can spend working out how to set up the best neural network you can! Good luck, and happy flowing! About the author Sam Abrahams is a freelance data engineer and animator in Los Angeles, CA, USA. He specializes in real-world applications of machine learning and is a contributor to TensorFlow. Sam runs a small tech blog, Memdump, and is an active member of the local hacker scene in West LA.
Web Components

Packt
17 Jun 2016
12 min read
In this article by Arshak Khachatryan, the author of Getting Started with Polymer, we will discuss web components. Currently, web technologies are growing rapidly. Though most websites use these technologies nowadays, we come across many with a bad, unresponsive UI design and awful performance. The only reason we should think about a responsive website is that users are now moving to the mobile web. 55% of the web users use mobile phones because they are faster and more comfortable. This is why we need to provide mobile content in the simplest way possible. Everything is moving to minimalism, even the Web. The new web standards are changing rapidly too. In this article, we will cover one of these new technologies, web components, and what they do. We will discuss the following specifications of web components in this article: Templates Shadow DOM (For more resources related to this topic, see here.) Templates In this section, we will discuss what we can do with templates. However, let's answer a few questions before this. What are templates, and why should we use them? Templates are basically fragments of HTML, but let's call these fragments as the "zombie" fragments of HTML as they are neither alive nor dead. What is meant by "neither alive nor dead"? Let me explain this with a real-life example. Once, when I was working on the ucraft.me project (it's a website built with a lot of cool stuff in it), we faced some rather new challenges with the templates. We had a lot of form elements, but we didn't know where to save the form elements content. We didn't want to load the DOM of each form element, but what could we do? As always, we did some magic; we created a lot of div elements with the form elements and hid it with CSS. But the CSS display: none property did not render the element, but it loaded the element. This was also a problem because there were a lot of form element templates, and it affected the performance of the website. I recommended to my team that they work with templates. Templates can contain HTML content, but they do not load the element nor render. We call template elements "dead elements" because they do not load the content until you get their content with JavaScript. Let's move ahead, and let me show you some examples of how you can create templates and do some stuff with its content. Imagine that you are working on a big project where you need to load some dynamic content without AJAX. If I had a task such as this, I would create a PHP file and get its content by calling the jQuery .load() function. However, now, you can save your content inside of the <template> element and get the content without any jQuery and AJAX but with a single line of JavaScript code. Let's create a template. In index.html, we have <template> and some content we want to get in the future, as shown in the following code block: <template class="superman"> <div> <img src="assets/img/superman.png" class="animated_superman" /> </div> </template> The time has now come for JavaScript! Execute the following code: <script> // selecting the template element with querySelector() var tmpl = document.querySelector('.superman'); //getting the <template> content var content = tmpl.content; // making some changes in the content content.querySelector('.animated_superman').width = 200; // appending the template to the body document.body.appendChild(content); </script> So, that's it! Cool, right? The content will load only after you append the content to the document. 
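One more detail worth noting before moving on: appending template.content directly, as above, physically moves the nodes out of the template, so it can only be stamped once. If you need several copies, you can clone the content first. Here is a small sketch reusing the same superman template (the widths are arbitrary), using the standard document.importNode() method:

<template class="superman">
    <div>
        <img src="assets/img/superman.png" class="animated_superman" />
    </div>
</template>
<script>
    var tmpl = document.querySelector('.superman');
    // importNode(content, true) makes a deep copy of the template content,
    // so the template itself stays intact and can be stamped again.
    for (var i = 0; i < 3; i++) {
        var clone = document.importNode(tmpl.content, true);
        clone.querySelector('.animated_superman').width = 100 + i * 50;
        document.body.appendChild(clone);
    }
</script>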
So, do you realize that templates are a part of the future web? If you are using Chrome Canary, just turn on the flags of experimental web platform features and enable HTML imports and experimental JavaScript. There are four ways to use templates, which are: Add templates with hidden elements in the document and just copy and paste the data when you need it, as follows:

<div hidden data-template="superman">
    <div>
        <p>SuperMan Head</p>
        <img src="assets/img/superman.png" class="animated_superman" />
    </div>
</div>

However, the problem is that a browser will load all the content. It means that the browser will load but not render images, video, audio, and so on. Get the content of the template as a string (by requesting it with AJAX or from <script type="x-template">). However, we might have some problems in working with the string. This can be dangerous for XSS attacks; we just need to pay some more attention to this:

<script data-template="batman" type="x-template">
    <div>
        <p>Batman Head this time!</p>
        <img src="assets/img/superman.png" class="animated_superman" />
    </div>
</script>

Compiled templates such as Hogan.js (http://twitter.github.io/hogan.js/) work with strings. So, they have the same flaw as the patterns of the second type. Templates do not have these disadvantages. We will work with the DOM and not with strings. We will then decide when to run the code. In conclusion: The <template> tag is not intended to replace templating systems. There are no tricky iteration operators or data bindings. Its main feature is to be able to insert "live" content along with scripts. Lastly, it does not require any libraries. Shadow DOM The Shadow DOM specification is a separate standard. A part of it is used for standard DOM elements, but it is also used to create web components. In this section, you will learn what Shadow DOM is and how to use it. Shadow DOM is an internal DOM tree that is separate from the external document. It can contain its own IDs, styles, and so on. Most importantly, Shadow DOM is not visible outside of its scope without the use of special techniques. Hence, there are no conflicts with the external world; it's like an iframe. Inside the browser The Shadow DOM concept has been used for a long time inside browsers themselves. When the browser shows complex controls, such as an <input type="range"> slider or an <input type="date"> calendar, it constructs them internally out of the most ordinary styled <div>, <span>, and other elements. They are invisible at first glance, but they can be easily seen if the checkbox in Chrome DevTools is set to display Shadow DOM: In the preceding code, #shadow-root is the Shadow DOM. Getting items from the Shadow DOM can only be done using special JavaScript calls or selectors. They are not children but a more powerful separation of content from the parent. In the preceding Shadow DOM, you can see a useful pseudo attribute. It is nonstandard and is present solely for historical reasons. It can be styled via CSS with the help of subelements; for example, let's change the form input dates to red via the following code:

<style>
    input::-webkit-datetime-edit {
        background: red;
    }
</style>
<input type="date" />

Once again, make a note of the pseudo custom attribute. Speaking chronologically, in the beginning, the browsers started to experiment with an encapsulated DOM structure inside their scopes, and then Shadow DOM appeared, which allowed developers to do the same. Now, let's work with the Shadow DOM from JavaScript, using the standard Shadow DOM API. 
Creating a Shadow DOM A Shadow DOM can be created for any element with the elem.createShadowRoot() call, as shown by the following code:

<div id="container">You know why?</div>
<script>
    var root = container.createShadowRoot();
    root.innerHTML = "Because I'm Batman!";
</script>

If you run this example, you will see that the contents of the #container element disappeared somewhere, and it only shows "Because I'm Batman!". This is because the element has a Shadow DOM and ignores its previous content. Once the Shadow DOM is created, the browser shows it instead of the element's ordinary content. If you wish, you can show the element's ordinary content inside this Shadow DOM. To do this, you need to specify where it should be placed. This is done through an "insertion point", which is declared using the <content> tag; here's an example:

<div id="container">You know why?</div>
<script>
    var root = container.createShadowRoot();
    root.innerHTML = '<h1><content></content></h1><p>Winter is coming!</p>';
</script>

Now, you will see "You know why?" in the title followed by "Winter is coming!". Here's a Shadow DOM example in Chrome DevTools: The following are some important details about the Shadow DOM: The <content> tag affects only the display, and it does not move the nodes physically. As you can see in the preceding picture, the node "You know why?" remained inside the div#container. It can even be obtained using container.firstElementChild. Inside the <content> tag, we have the content of the element itself; in this example, the string "You know why?". With the select attribute of the <content> element, you can specify a selector for the content you want to transfer; for example, <content select="p"></content> will transfer only paragraphs. Inside the Shadow DOM, you can use the <content> tag multiple times with different values of select, thus indicating where to place which part of the original content. However, it is impossible to duplicate nodes. If a node has already been shown by one <content> tag, it will be skipped by the next one. For example, if there is a <content select="h3.title"> tag and then <content select="h3">, the first <content> will show the <h3> headers with the class title, while the second will show all the others, except for the ones already shown. In the preceding example from DevTools, the <content></content> tag is empty. If we put some default content inside the <content> tag, it will be shown only when there are no matching nodes to insert. Check out the following code:

<div id="container">
    <h3>Once upon a time, in Westeros</h3>
    <strong>Ruled a king by name Joffrey and he's dead!</strong>
</div>
<script>
    var root = container.createShadowRoot();
    root.innerHTML = '<content select="h3"></content>' +
        '<content select=".hero">Jon Snow</content>' +
        '<content></content>';
</script>

When you run the JS code, you will see the following: The first <content select="h3"> tag will display the title. The second <content select=".hero"> tag would show the hero's name, but as there is no element matching this selector, it falls back to the default value placed inside the tag ("Jon Snow"). The third <content> tag displays the rest of the element's original content, excluding the <h3> header that has already been shown. Once again, note that <content> does not physically move nodes in the DOM; it only changes where they are shown. Root shadowRoot After the creation of a root in the internal DOM, the tree will be available as container.shadowRoot. 
It is a special object that supports the basic CSS query methods and is described in detail in the ShadowRoot specification. You need to go through container.shadowRoot if you need to work with content in the Shadow DOM. You can create a new Shadow DOM tree from JavaScript; here's an example:

<div id="container">Polycasts</div>
<script>
    // create a new Shadow DOM tree for the element
    var root = container.createShadowRoot();
    root.innerHTML = "<h1><content></content></h1> <strong>Hey googlers! Let's code today.</strong>";
</script>
<script>
    // read data from the Shadow DOM of the element
    var root = container.shadowRoot;
    // Hey googlers! Let's code today.
    document.write('<br/><em>container: ' + root.querySelector('strong').innerHTML);
    // empty, because the distributed nodes are not physical children of <content>
    document.write('<br/><em>content: ' + root.querySelector('content').innerHTML);
</script>

To finish up, Shadow DOM is a tool to create a separate DOM tree inside an element, which is not visible from outside without using special techniques: A lot of browser components with complex structures have Shadow DOM already. You can create a Shadow DOM inside every element by calling elem.createShadowRoot(). Afterwards, it will be available as elem.shadowRoot, so you can access the Shadow DOM from the outside; this kind of access is not available for the shadow roots the browser creates for its built-in controls. Once the Shadow DOM appears in the element, the element's ordinary content is hidden and only the Shadow DOM is shown. The <content> element moves the contents of the original item into the Shadow DOM only visually. However, it remains in the same place in the DOM structure. Detailed specifications are given at http://w3c.github.io/webcomponents/spec/shadow/. Summary Using web components, you can easily create your web application by splitting it into parts/components. Resources for Article: Further resources on this subject: Handling the DOM in Dart [article] Manipulation of DOM Objects using Firebug [article] jQuery 1.4 DOM Manipulation Methods for Style Properties and Class Attributes [article]
A capability model for microservices

Packt
17 Jun 2016
19 min read
In this article by Rajesh RV, the author of Spring Microservices, you will learn aboutthe concepts of microservices. More than sticking to definitions, it is better to understand microservices by examining some common characteristics of microservices that are seen across many successful microservices implementations. Spring Boot is an ideal framework to implement microservices. In this article, we will examine how to implement microservices using Spring Boot with an example use case. Beyond services, we will have to be aware of the challenges around microservices implementation. This article will also talk about some of the common challenges around microservices. A successful microservices implementation has to have some set of common capabilities. In this article, we will establish a microservices capability model that can be used in a technology-neutral framework to implement large-scale microservices. What are microservices? Microservices is an architecture style used by many organizations today as a game changer to achieve a high degree of agility, speed of delivery, and scale. Microservices give us a way to develop more physically separated modular applications. Microservices are not invented. Many organizations, such as Netflix, Amazon, and eBay, successfully used the divide-and-conquer technique to functionally partition their monolithic applications into smaller atomic units, and each performs a single function. These organizations solved a number of prevailing issues they experienced with their monolithic application. Following the success of these organizations, many other organizations started adopting this as a common pattern to refactor their monolithic applications. Later, evangelists termed this pattern microservices architecture. Microservices originated from the idea of Hexagonal Architecture coined by Alister Cockburn. Hexagonal Architecture is also known as thePorts and Adapters pattern. Microservices is an architectural style or an approach to building IT systems as aset of business capabilities that are autonomous, self-contained, and loosely coupled. The preceding diagram depicts a traditional N-tier application architecture having a presentation layer, business layer, and database layer. Modules A, B, and C represents three different business capabilities. The layers in the diagram represent a separation of architecture concerns. Each layer holds all three business capabilities pertaining to this layer. The presentation layer has the web components of all three modules, the business layer has the business components of all the three modules, and the database layer hosts tables of all the three modules. In most cases, layers are physically spreadable, whereas modules within a layer are hardwired. Let's now examine a microservices-based architecture, as follows: As we can note in the diagram, the boundaries are inverted in the microservices architecture. Each vertical slice represents a microservice. Each microservice has its own presentation layer, business layer, and database layer. Microservices are aligned toward business capabilities. By doing so, changes to one microservice do not impact others. There is no standard for communication or transport mechanisms for microservices. In general, microservices communicate with each other using widely adopted lightweight protocols such as HTTP and REST or messaging protocols such as JMS or AMQP. In specific cases, one might choose more optimized communication protocols such as Thrift, ZeroMQ, Protocol Buffers, or Avro. 
As microservices are more aligned to the business capabilities and have independently manageable lifecycles, they are the ideal choice for enterprises embarking on DevOps and cloud. DevOps and cloud are two other facets of microservices. Microservices are self-contained, independently deployable, and autonomous services that take full responsibility of a business capability and its execution. They bundle all dependencies, including library dependencies and execution environments such as web servers and containers or virtual machines that abstract physical resources. These self-contained services assume single responsibility and are well enclosed with in a bounded context. Microservices – The honeycomb analogy The honeycomb is an ideal analogy to represent the evolutionary microservices architecture. In the real world, bees build a honeycomb by aligning hexagonal wax cells. They start small, using different materials to build the cells. Construction is based on what is available at the time of building. Repetitive cells form a pattern and result in a strong fabric structure. Each cell in the honeycomb is independent but also integrated with other cells. By adding new cells, the honeycomb grows organically to a big solid structure. The content inside each cell is abstracted and is not visible outside. Damage to one cell does not damage other cells, and bees can reconstruct these cells without impacting the overall honeycomb. Characteristics of microservices The microservices definition discussed at the beginning of this article is arbitrary. Evangelists and practitioners have strong but sometimes different opinions on microservices. There is no single, concrete, and universally accepted definition for microservices. However, all successful microservices implementations exhibit a number of common characteristics. Some of these characteristics are explained as follows: Since microservices are more or less similar to a flavor of SOA, many of the service characteristics of SOA are applicable to microservices, as well. In the microservices world, services are first-class citizens. Microservices expose service endpoints as APIs and abstract all their realization details. The APIs could be synchronous or asynchronous. HTTP/REST is the popular choice for APIs. As microservices are autonomous and abstract everything behind service APIs, it is possible to have different architectures for different microservices. The internal implementation logic, architecture, and technologies, including programming language, database, quality of service mechanisms,and so on, are completely hidden behind the service API. Well-designed microservices are aligned to a single business capability, so they perform only one function. As a result, one of the common characteristics we see in most of the implementations are microservices with smaller footprints. Most of the microservices implementations are automated to the maximum extent possible, from development to production. Most large-scale microservices implementations have a supporting ecosystem in place. The ecosystem's capabilities include DevOps processes, centralized log management, service registry, API gateways, extensive monitoring, service routing and flow control mechanisms, and so on. Successful microservices implementations encapsulate logic and data within the service. This results in two unconventional situations: a distributed data and logic and decentralized governance. 
A microservice example The Customer Profile microservice example explained here demonstrates the implementation of a microservice and the interaction between different microservices. In this example, two microservices, Customer Profile and Customer Notification, will be developed. As shown in the diagram, the Customer Profile microservice exposes methods to create, read, update, and delete a customer and a registration service to register a customer. The registration process applies a certain business logic, saves the customer profile, and sends a message to the CustomerNotification microservice. The CustomerNotification microservice accepts the message sent by the registration service and sends an e-mail message to the customer using an SMTP server. Asynchronous messaging is used to integrate CustomerProfile with the CustomerNotification service. The customer microservices class domain model diagram is as shown here: Implementing this Customer Profile microservice is not a big deal. The Spring framework, together with Spring Boot, provides all the necessary capabilities to implement this microservice without much hassle. The key is CustomerController in the diagram, which exposes the REST endpoint for our microservice. It is also possible to use HATEOAS to explore the repository's REST services directly using the @RepositoryRestResource annotation. The following code sample shows the Spring Boot main class called Application and the REST endpoint definition for the registration of a new customer:

@SpringBootApplication
public class Application {
    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }
}

@RestController
class CustomerController {
    //other code here

    @RequestMapping(path = "/register", method = RequestMethod.POST)
    Customer register(@RequestBody Customer customer) {
        return customerComponent.register(customer);
    }
}

CustomerController invokes a component class, CustomerComponent. The component class/bean handles all the business logic. CustomerRepository is a Spring Data JPA repository defined to handle the persistence of the Customer entity. The whole application will then be deployed as a Spring Boot application by building a standalone jar rather than using the conventional war file. Spring Boot encapsulates the server runtime along with the fat jar it produces. By default, it is an instance of the Tomcat server. CustomerComponent, in addition to calling the CustomerRepository class, sends a message to the RabbitMQ queue, where the CustomerNotification component is listening. This can be easily achieved in Spring using the RabbitMessagingTemplate class as shown in the following Sender implementation:

@Component
class CustomerComponent {
    //other code here

    Customer register(Customer customer) {
        customerRepository.save(customer);
        sender.send(customer.getEmail());
        return customer;
    }
}

@Component
@Lazy
class Sender {
    RabbitMessagingTemplate template;

    @Autowired
    Sender(RabbitMessagingTemplate template) {
        this.template = template;
    }

    @Bean
    Queue queue() {
        return new Queue("CustomerQ", false);
    }

    public void send(String message) {
        template.convertAndSend("CustomerQ", message);
    }
}

The receiver on the other side consumes the message using RabbitListener and sends out an e-mail using the JavaMailSender component. 
Execute the following code:

@Component
class Receiver {
    @Autowired
    private JavaMailSender javaMailService;

    @Bean
    Queue queue() {
        return new Queue("CustomerQ", false);
    }

    @RabbitListener(queues = "CustomerQ")
    public void processMessage(String email) {
        System.out.println(email);
        SimpleMailMessage mailMessage = new SimpleMailMessage();
        mailMessage.setTo(email);
        mailMessage.setSubject("Registration");
        mailMessage.setText("Successfully Registered");
        javaMailService.send(mailMessage);
    }
}

In this case, CustomerNotification is our second Spring Boot microservice. This time, instead of a REST endpoint, it exposes only a message listener endpoint. Microservices challenges In the previous section, you learned about the right design decisions to be made and the trade-offs to be applied. In this section, we will review some of the challenges with microservices. Take a look at the following list: Data islands: Microservices abstract their own local transactional store, which is used for their own transactional purposes. The type of store and the data structure will be optimized for the services offered by the microservice. This can lead to data islands and, hence, challenges around aggregating data from different transactional stores to derive meaningful information. Logging and monitoring: Log files are a good piece of information for analysis and debugging. As each microservice is deployed independently, they emit separate logs, maybe to a local disk. This will result in fragmented logs. When we scale services across multiple machines, each service instance would produce separate log files. This makes it extremely difficult to debug and understand the behavior of the services through log mining. Dependency management: Dependency management is one of the key issues in large microservices deployments. How do we ensure the chattiness between services is manageable? How do we identify and reduce the impact of a change? How do we know whether all the dependent services are up and running? How will the service behave if one of the dependent services is not available? Organization's culture: One of the biggest challenges in microservices implementation is the organization's culture. An organization following waterfall development or heavyweight release management processes with infrequent release cycles is a challenge for microservices development. Insufficient automation is also a challenge for microservices deployments. Governance challenges: Microservices impose decentralized governance, and this is quite in contrast to the traditional SOA governance. Organizations may find it hard to cope with this change, and this could negatively impact microservices development. How can we know who is consuming a service? How do we ensure service reuse? How do we define which services are available in the organization? How do we ensure that the enterprise policies are enforced? Operation overheads: Microservices deployments generally increase the number of deployable units and virtual machines (or containers). This adds significant management overheads and cost of operations. With a single application, a dedicated number of containers or virtual machines in an on-premises data center may not make much sense unless the business benefit is high. With many microservices, the number of Configurable Items (CIs) is too high, and the number of servers in which these CIs are deployed might also be unpredictable. 
This makes it extremely difficult to manage data in a traditional Configuration Management Database (CMDB). Testing microservices: Microservices also pose a challenge for the testability of services. In order to achieve full service functionality, one service may rely on another service, and this, in turn, may rely on another service, either synchronously or asynchronously. The issue is how we test an end-to-end service to evaluate its behavior. Dependent services may or may not be available at the time of testing. Infrastructure provisioning: As briefly touched upon under operation overheads, manual deployment can severely challenge microservices rollouts. If a deployment has manual elements, the deployer or operational administrators should know the running topology, manually reroute traffic, and then deploy the application one by one until all the services are upgraded. With many server instances running, this could lead to significant operational overheads. Moreover, the chance of error is high in this manual approach. Beyond just services – The microservices capability model Microservices are not as simple as the Customer Profile implementation we discussed earlier. This is specifically true when deploying hundreds or thousands of services. In many cases, an improper microservices implementation may lead to a number of challenges, as mentioned before. Any successful Internet-scale microservices deployment requires a number of additional surrounding capabilities. The following diagram depicts the microservices capability model: The capability model is broadly classified into four areas, as follows: Core capabilities, which are part of the microservices themselves Supporting capabilities, which are software solutions supporting core microservice implementations Infrastructure capabilities, which are infrastructure-level expectations for a successful microservices implementation Governance capabilities, which are more of process, people, and reference information Core capabilities The core capabilities are explained here: Service listeners (HTTP/Messaging): If microservices are enabled for HTTP-based service endpoints, then the HTTP listener will be embedded within the microservices, thereby eliminating the need for any external application server. The HTTP listener will be started at the time of the application startup. If the microservice is based on asynchronous communication, then instead of an HTTP listener, a message listener will be started. Optionally, other protocols could also be considered. There may not be any listeners if the microservice is a scheduled service. Spring Boot and Spring Cloud Streams provide this capability. Storage capability: Microservices have storage mechanisms to store state or transactional data pertaining to the business capability. This is optional, depending on the capabilities that are implemented. The storage could be either a physical storage (RDBMS, such as MySQL, and NoSQL, such as Hadoop, Cassandra, Neo4J, Elasticsearch, and so on), or it could be an in-memory store (caches, such as Ehcache, and data grids, such as Hazelcast, Infinispan, and so on). Business capability definition: This is the core of microservices, in which the business logic is implemented. This could be implemented in any applicable language, such as Java, Scala, Clojure, Erlang, and so on. All required business logic to fulfil the function is embedded within the microservice itself. 
Event sourcing: Microservices send out state changes to the external world without really worrying about the targeted consumers of these events. They could be consumed by other microservices, supporting services such as audit by replication, external applications,and so on. This will allow other microservices and applications to respond to state changes. Service endpoints and communication protocols: This defines the APIs for external consumers to consume. These could be synchronous endpoints or asynchronous endpoints. Synchronous endpoints could be based on REST/JSON or other protocols such as Avro, Thrift, protocol buffers, and so on. Asynchronous endpoints will be through Spring Cloud Streams backed by RabbitMQ or any other messaging servers or other messaging style implementations, such as Zero MQ. The API gateway: The API gateway provides a level of indirection by either proxying service endpoints or composing multiple service endpoints. The API gateway is also useful for policy enforcements. It may also provide real-time load balancing capabilities. There are many API gateways available in the market. Spring Cloud Zuul, Mashery, Apigee, and 3 Scale are some examples of API gateway providers. User interfaces: Generally, user interfaces are also part of microservices for users to interact with the business capabilities realized by the microservices. These could be implemented in any technology and is channel and device agnostic. Infrastructure capabilities Certain infrastructure capabilities are required for a successful deployment and to manage large-scale microservices. When deploying microservices at scale, not having proper infrastructure capabilities can be challenging and can lead to failures. Cloud: Microservices implementation is difficult in a traditional data center environment with a long lead time to provision infrastructures. Even a large number of infrastructure dedicated per microservice may not be very cost effective. Managing them internally in a data center may increase the cost of ownership and of operations. A cloud-like infrastructure is better for microservices deployment. Containers or virtual machines: Managing large physical machines is not cost effective and is also hard to manage. With physical machines, it is also hard to handle automatic fault tolerance. Virtualization is adopted by many organizations because of its ability to provide an optimal use of physical resources, and it provides resource isolation. It also reduces the overheads in managing large physical infrastructure components. Containers are the next generation of virtual machines. VMWare, Citrix,and so on provide virtual machine technologies. Docker, Drawbridge, Rocket, and LXD are some containerizing technologies. Cluster control and provisioning: Once we have a large number of containers or virtual machines, it is hard to manage and maintain them automatically. Cluster control tools provide a uniform operating environment on top of the containers and share the available capacity across multiple services. Apache Mesos and Kubernetes are examples of cluster control systems. Application lifecycle management: Application lifecycle management tools help to invoke applications when a new container is launched or kill the application when the container shuts down. Application lifecycle management allows to script application deployments and releases. It automatically detects failure scenarios and responds to them, thereby ensuring the availability of the application. 
This works in conjunction with the cluster control software. Marathon partially addresses this capability. Supporting capabilities Supporting capabilities are not directly linked to microservices, but these are essential for large-scale microservices development. Software-defined load balancer: The load balancer should be smart enough to understand the changes in deployment topology and respond accordingly. This moves away from the traditional approach of configuring static IP addresses, domain aliases, or cluster addresses in the load balancer. When new servers are added to the environment, it should automatically detect this and include them in the logical cluster without any manual interaction. Similarly, if a service instance is unavailable, it should take it out of the load balancer. A combination of Ribbon, Eureka, and Zuul provides this capability in Spring Cloud Netflix. Central log management: As explored earlier in this article, a capability is required to centralize all the logs emitted by service instances with correlation IDs. This helps in debugging, identifying performance bottlenecks, and predictive analysis. The result of this could feed back into the lifecycle manager to take corrective actions. Service registry: A service registry provides a runtime environment for services to automatically publish their availability at runtime. A registry will be a good source of information to understand the services topology at any point. Eureka from Spring Cloud, ZooKeeper, and Etcd are some of the service registry tools available. Security service: The distributed microservices ecosystem requires a central server to manage service security. This includes service authentication and token services. OAuth2-based services are widely used for microservices security. Spring Security and Spring Security OAuth are good candidates to build this capability. Service configuration: All service configurations should be externalized, as discussed in the Twelve-Factor application principles. A central service for all configurations could be a good choice. The Spring Cloud Config server and Archaius are out-of-the-box configuration servers. Testing tools (Anti-Fragile, RUM, and so on): Netflix uses Simian Army for antifragile testing. Mature services need consistent challenges to see the reliability of the services and how good the fallback mechanisms are. Simian Army components create various error scenarios to explore the behavior of the system under failure conditions. Monitoring and dashboards: Microservices also require a strong monitoring mechanism. This monitoring is not just at the infrastructure level but also at the service level. Spring Cloud Netflix Turbine, the Hystrix dashboard, and others provide service-level information. End-to-end monitoring tools, such as AppDynamics, New Relic, Dynatrace, and other tools such as StatsD, Sensu, and Spigo, could add value in microservices monitoring. Dependency and CI management: We also need tools to discover runtime topologies, to find service dependencies, and to manage configurable items (CIs). A graph-based CMDB is a more obvious choice to manage these scenarios. Data lakes: As discussed earlier in this article, we need a mechanism to combine data stored in different microservices and perform near real-time analytics. Data lakes are a good choice to achieve this. Data ingestion tools such as Spring Cloud Data Flow, Flume, and Kafka are used to consume data. HDFS, Cassandra, and others are used to store data. 
Reliable messaging: If the communication is asynchronous, we may need a reliable messaging infrastructure service, such as RabbitMQ or any other reliable messaging service. Cloud messaging or messaging as a service is a popular choice for Internet-scale message-based service endpoints. Process and governance capabilities The last pieces in the puzzle are the process and governance capabilities required for microservices, which are: DevOps: The key to a successful implementation is to adopt DevOps. DevOps complements microservices development by supporting agile development, high-velocity delivery, automation, and better change management. DevOps tools: DevOps tools for agile development, continuous integration, continuous delivery, and continuous deployment are essential for a successful delivery of microservices. A lot of emphasis is required on automated, functional, and real user testing, as well as synthetic, integration, release, and performance testing. Microservices repository: A microservices repository is where the versioned binaries of microservices are placed. These could be a simple Nexus repository or container repositories such as the Docker registry. Microservice documentation: It is important to have all microservices properly documented. Swagger or API Blueprint are helpful in achieving good microservices documentation. Reference architecture and libraries: Reference architecture provides a blueprint at the organization level to ensure that services are developed according to certain standards and guidelines in a consistent manner. Many of these could then be translated to a number of reusable libraries that enforce service development philosophies. Summary In this article, you learned the concepts and characteristics of microservices. We used the Customer Profile microservice as an example to understand these concepts better. We also examined some of the common challenges in large-scale microservice implementation. Finally, we established a microservices capability model in this article that can be used to deliver successful Internet-scale microservices.
How to use SQLite with Ionic to store data?

Oli Huggins
13 Jun 2016
10 min read
Hybrid Mobile apps have a challenging task of being as performant as native apps, but I always tell other developers that it depends on not the technology but how we code. The Ionic Framework is a popular hybrid app development library, which uses optimal design patterns to create awe-inspiring mobile experiences. We cannot exactly use web design patterns to create hybrid mobile apps. The task of storing data locally on a device is one such capability, which can make or break the performance of your app. In a web app, we may use localStorage to store data but mobile apps require much more data to be stored and swift access. Localstorage is synchronous, so it acts slow in accessing the data. Also, web developers who have experience of coding in a backend language such as C#, PHP, or Java would find it more convenient to access data using SQL queries than using object-based DB. SQLite is a lightweight embedded relational DBMS used in web browsers and web views for hybrid mobile apps. It is similar to the HTML5 WebSQL API and is asynchronous in nature, so it does not block the DOM or any other JS code. Ionic apps can leverage this tool using an open source Cordova Plugin by Chris Brody (@brodybits). We can use this plugin directly or use it with the ngCordova library by the Ionic team, which abstracts Cordova plugin calls into AngularJS-based services. We will create an Ionic app in this blog post to create Trackers to track any information by storing it at any point in time. We can use this data to analyze the information and draw it on charts. We will be using the ‘cordova-sqlite-ext’ plugin and the ngCordova library. We will start by creating a new Ionic app with a blank starter template using the Ionic CLI command, ‘$ ionic start sqlite-sample blank’. We should also add appropriate platforms for which we want to build our app. The command to add a specific platform is ‘$ ionic platform add <platform_name>’. Since we will be using ngCordova to manage SQLite plugin from the Ionic app, we have to now install ngCordova to our app. Run the following bower command to download ngCordova dependencies to the local bower ‘lib’ folder: bower install ngCordova We need to inject the JS file using a script tag in our index.html: <script src=“lib/ngCordova/dist/ng-cordova.js"></script> Also, we need to include the ngCordova module as a dependency in our app.js main module declaration: angular.module('starter', [‘ionic’,’ngCordova']) Now, we need to add the Cordova plugin for SQLite using the CLI command: cordova plugin add https://github.com/litehelpers/Cordova-sqlite-storage.git Since we will be using the $cordovaSQLite service of ngCordova only to access this plugin from our Ionic app, we need not inject any other plugin. We will have the following two views in our Ionic app: Trackers list: This list shows all the trackers we add to DB Tracker details: This is a view to show list of data entries we make for a specific tracker We would need to create the routes by registering the states for the two views we want to create. 
We need to add the following config block code for our ‘starter’ module in the app.js file only: .config(function($stateProvider,$urlRouterProvider){ $urlRouterProvider.otherwise('/') $stateProvider.state('home', { url: '/', controller:'TrackersListCtrl', templateUrl: 'js/trackers-list/template.html' }); $stateProvider.state('tracker', { url: '/tracker/:id', controller:'TrackerDetailsCtrl', templateUrl: 'js/tracker-details/template.html' }) }); Both views would have similar functionality, but will display different entities. Our view will display a list of trackers from the SQLite DB table and also provide a feature to add a new tracker or delete an existing one. Create a new folder named trackers-list where we can store our controller and template for the view. We will also abstract our code to access the SQLite DB into an Ionic factory. We will implement the following methods: initDB: This will initialize or create a table for this entity if it does not exist getAllTrackers: This will get all trackers list rows from the created table addNewTracker - This is a method to insert a new row for a new tracker into the table deleteTracker - This is a method to delete a specific tracker using its ID getTracker - This will get a specific Tracker from the cached list using an ID to display anywhere We will be injecting the $cordovaSQLite service into our factory to interact with our SQLite DB. We can open an existing DB or create a new DB using the command $cordovaSQLite.openDB(“myApp.db”). We have to store the object reference returned from this method call, so we will store it in a module-level variable called db. We have to pass this object reference to our future $cordovaSQLite service calls. $cordovaSQLite has a handful of methods to provide varying features: openDB: This is a method to establish a connection to the existing DB or create a new DB execute: This is a method to execute a single SQL command query insertCollection: This is a method to insert bulk values nestedExecute: This is a method to run nested queries deleted: This is a method to delete a particular DB We see the usage of openDB and execute the command in this post. In our factory, we will create a standard method runQuery to adhere to DRY(Don’t Repeat Yourself) principles. The code for the runQuery function is as follows: function runQuery(query,dataParams,successCb,errorCb) { $ionicPlatform.ready(function() { $cordovaSQLite.execute(db, query,dataParams).then(function(res) { successCb(res); }, function (err) { errorCb(err); }); }.bind(this)); } In the preceding code, we pass the query as a string, dataParams (dynamic query parameters) as an array, and successCB/errorCB as callback functions. We should always ensure that any Cordova plugin code should be called when the Cordova ready event is already fired, which is ensured by the $ionicPlatform.ready() method. We will then call the execute method of the $cordovaSQLite service passing the ‘db’ object reference, query, and dataParams as arguments. The method returns a promise to which we register callbacks using the ‘.then’ method. We pass the results or error using the success callback or error callback. Now, we will write code for each of the methods to initialize DB, insert a new row, fetch all rows, and then delete a row. 
initDB Method:

function initDB() {
  db = $cordovaSQLite.openDB("myapp.db");
  var query = "CREATE TABLE IF NOT EXISTS trackers_list (id integer primary key autoincrement, name string)";
  runQuery(query, [], function(res) {
    console.log("table created ");
  }, function(err) {
    console.log(err);
  });
}

In the preceding code, the openDB method is used to establish a connection with an existing DB or create a new DB. Then, we run the query to create a new table called trackers_list if it does not exist. We define an id column as an auto-incrementing integer primary key and a name column of type string.

addNewTracker Method:

function addNewTracker(name) {
  var deferred = $q.defer();
  var query = "INSERT INTO trackers_list (name) VALUES (?)";
  runQuery(query, [name], function(response) {
    //Success Callback
    console.log(response);
    deferred.resolve(response);
  }, function(error) {
    //Error Callback
    console.log(error);
    deferred.reject(error);
  });
  return deferred.promise;
}

In the preceding code, we take name as an argument, which will be passed into the insert query. We write the insert query and add a new row to the trackers_list table, where the ID will be auto-generated. We pass dynamic query parameters using the ? character in our query string, which will be replaced by the elements of the dataParams array passed as the second argument to the runQuery method. We also use the $q library to return a promise from our factory methods so that controllers can manage the asynchronous calls.

getAllTrackers Method: This method is the same as the addNewTracker method, only without the name parameter, and it uses the following query (a complete sketch of this method is given at the end of this walkthrough):

var query = "SELECT * from trackers_list";

This method will return a promise, which when resolved will give the response from the $cordovaSQLite method. The response object will have the following structure:

{
  insertId: <specific_id>,
  rows: { item: function, length: <total_no_of_rows> },
  rowsAffected: 0
}

The response object has an insertId property representing the new ID generated for the row, a rowsAffected property giving the number of rows affected by the query, and a rows object with an item method property, to which we can pass the index of a row to retrieve it. We will write the following code in the controller to convert the response.rows object into an iterable array of rows to be displayed using the ng-repeat directive:

for (var i = 0; i < response.rows.length; i++) {
  $scope.trackersList.push({
    id: response.rows.item(i).id,
    name: response.rows.item(i).name
  });
}

The code in the template to display the list of Trackers would be as follows:

<ion-item ui-sref="tracker({id:tracker.id})" class="item-icon-right" ng-repeat="tracker in trackersList track by $index">
  {{tracker.name}}
  <ion-delete-button class="ion-minus-circled" ng-click="deleteTracker($index,tracker.id)">
  </ion-delete-button>
  <i class="icon ion-chevron-right"></i>
</ion-item>

deleteTracker Method:

function deleteTracker(id) {
  var deferred = $q.defer();
  var query = "DELETE FROM trackers_list WHERE id = ?";
  runQuery(query, [id], function(response) {
  … [Same code as in the addNewTracker method]
}

The deleteTracker method has the same code as the addNewTracker method, where the only change is in the query and the argument passed. We pass id as the argument to be used in the WHERE clause of the delete query to delete the row with that specific ID.

Rest of the Ionic App Code: The rest of the app code has not been discussed in this post because we have already discussed the code that is intended for integration with SQLite.
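As promised above, here is the getAllTrackers method assembled in full from the pattern that addNewTracker follows. Treat it as a hedged sketch; the exact code in the finished app may differ slightly:

function getAllTrackers() {
  var deferred = $q.defer();
  var query = "SELECT * from trackers_list";
  runQuery(query, [], function(response) {
    // Success callback: hand the raw result set back to the controller
    deferred.resolve(response);
  }, function(error) {
    // Error callback
    console.log(error);
    deferred.reject(error);
  });
  return deferred.promise;
}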
You can implement your own version of this app or even use this sample code for any other use case. The tracker details view will be implemented in the same way, storing its data in a tracker_entries table that uses tracker_id as a foreign key. The same ID is then used in the SELECT query to fetch the entries for a specific tracker on its detail view. The exact, functioning code for the complete app developed during this tutorial is available on GitHub.

About the author

Rahat Khanna is a techno nerd experienced in developing web and mobile apps for many international MNCs and start-ups. He completed his Bachelor of Technology with computer science and engineering as his specialization. During the last 7 years, he has worked for a multinational IT services company and ran his own entrepreneurial venture in his early twenties. He has worked on projects ranging from static HTML websites to scalable web applications and engaging mobile apps. Along with his current job as a senior UI developer at Flipkart, a billion-dollar e-commerce firm, he blogs on the latest technology frameworks on sites such as www.airpair.com, appsonmob.com, and so on, and delivers talks at community events. He has been helping individual developers and start-ups with their Ionic projects to deliver amazing mobile apps.
Designing a Simple, Robust Object Detector and Classifier
In this article by Joseph Howse, author of the book, iOS Application Development with OpenCV 3, illustrates a scale-invariant,rotation-invariant approach to object detection and classification, using OpenCV 3and just 250 lines of custom C++ code. The technique relies on blob detection, histogram analysis, and SURF (or ORB if SURF is unavailable).The classifier is sensitive to colors as well as keypoints, and itcan work with a small number of training images. For background information, sample images, and a complete tutorial on how to integrate this detector and classifier into an iOS application, refer toChapter 5, Classifying Coins and Commodities in the book,iOS Application Development with OpenCV 3 (Packt Publishing, 2016). You could also use this article's C++ code on other platforms besides iOS. (For more resources related to this topic, see here.) Defining blobs and a blob detector For our purposes, a blob simply has an image and a label. The image is cv::Mat and the label is an unsigned integer. The label's default value is 0, which shall signify that the blob has not yet been classified. Create a new header file, Blob.h, and fill it with the following declaration of a Blob class: #ifndef BLOB_H #define BLOB_H   #include <opencv2/core.hpp>   class Blob { public:   Blob(const cv::Mat &mat, uint32_t label = 0ul);     /**    * Construct an empty blob.    */   Blob();     /**    * Construct a blob by copying another blob.    */   Blob(const Blob &other);     bool isEmpty() const;     uint32_t getLabel() const;   void setLabel(uint32_t value);     const cv::Mat &getMat() const;   int getWidth() const;   int getHeight() const;   private:   uint32_t label;     cv::Mat mat; };   #endif // BLOB_H A Blob's image does not change after construction, but the label may change as a result of our classification process. Note that most of Blob's methods have the const modifier, but of course,setLabel does not because it changes the label. Now, let's declare a BlobDetector class in another new header file, BlobDetector.h. This class provides a detect public method to analyze a given image and populate vector<Blob> based on detected objects in the image. Another public method, getMask, returns a thresholded version of the most recent image that the detect method received. Internally, BlobDetector uses several more matrices and vectors to hold intermediate results, including the mask, detected edges, detected contours, and hierarchy that describes the contours' relationship to each other. Here is the detector's declaration: class BlobDetector { public:   void detect(cv::Mat &image, std::vector<Blob>&blob,     double resizeFactor = 1.0, bool draw = false);     const cv::Mat &getMask() const;   private:   void createMask(const cv::Mat &image);     cv::Mat resizedImage;   cv::Mat mask;   cv::Mat edges;   std::vector<std::vector<cv::Point>> contours;   std::vector<cv::Vec4i> hierarchy; };   #endif // !BLOB_DETECTOR_H Later, in the Detecting blobs against a plain background section, we will define the methods' bodies in new files called Blob.cpp and BlobDetector.cpp. Defining blob descriptors and a blob classifier If you are familiar with keypoint matching, you know that a keypoint has a descriptor or set of descriptive statistics. Similarly, we can define a custom descriptor for a blob. As our classifier relies on histogram comparison and keypoint matching, let's say that a blob's descriptor consists of a normalized histogram and matrix of keypoint descriptors. 
The descriptor object is also a convenient place to put the label. Create a new header file, BlobDescriptor.h, and put the following declaration of a BlobDescriptor class in it: #ifndef BLOB_DESCRIPTOR_H #define BLOB_DESCRIPTOR_H   #include <opencv2/core.hpp>   class BlobDescriptor { public:   BlobDescriptor(const cv::Mat &normalizedHistogram,     const cv::Mat &keypointDescriptors, uint32_t label);     const cv::Mat &getNormalizedHistogram() const;   const cv::Mat &getKeypointDescriptors() const;   uint32_t getLabel() const;   private:   cv::Mat normalizedHistogram;   cv::Mat keypointDescriptors;   uint32_t label; };   #endif // !BLOB_DESCRIPTOR_H Note that BlobDescriptor is designed as an immutable class. All its methods, except the constructor, have the const modifier. Now, let's declare a BlobClassifier class in another new header file, BlobClassifier.h. Publicly, this class receives Blob objects via an update method (for reference blobs) and a classify method (for blobs that the detector found in the scene). Privately, BlobClassifier creates, owns, and compares BlobDescriptor objects that pertain to the Blob objects. Thus, BlobClassifier is the only part of our program that needs to deal with BlobDescriptor. BlobClassifier also owns instances of OpenCV classes that are responsible for keypoint detection, description, and matching. Here is our classifier's declaration: #ifndef BLOB_CLASSIFIER_H #define BLOB_CLASSIFIER_H   #import "Blob.h" #import "BlobDescriptor.h"   #include <opencv2/features2d.hpp>   class BlobClassifier { public:   BlobClassifier();     /**    * Add a reference blob to the classification model.    */   void update(const Blob &referenceBlob);     /**    * Clear the classification model.    */   void clear();     /**    * Classify a blob that was detected in a scene.    */   void classify(Blob &detectedBlob) const;   private:   BlobDescriptor createBlobDescriptor(const Blob &blob) const;   float findDistance(const BlobDescriptor &detectedBlobDescriptor,     const BlobDescriptor &referenceBlobDescriptor) const;     /**    * A feature detector and descriptor extractor.    * It finds features in images.    * Then, it creates descriptors of the features.    */   cv::Ptr<cv::Feature2D> featureDetectorAndDescriptorExtractor;     /**    * A descriptor matcher.    * It matches features based on their descriptors.    */   cv::Ptr<cv::DescriptorMatcher> descriptorMatcher;     /**    * Descriptors of the reference blobs.    */   std::vector<BlobDescriptor> referenceBlobDescriptors; };   #endif // !BLOB_CLASSIFIER_H Later, in the Classifying blobs by color and keypoints section, we will write the methods' bodies in new files called BlobDescriptor.cpp and BlobClassifier.cpp. Detecting blobs against a plain background Let's assume that the background has a distinctive color range, such as "cream to snow white". Our blob detector will calculate the image's dominant color range and search for large regions whose colors differ from this range. These anomalous regions will constitute the detected blobs. For small objects such as a bean or coin, a user can easily find a plain background such as a blank sheet of paper, plain table-top, plain piece of clothing, or even the palm of a hand. As our blob detector dynamically estimates the background color range, it can cope with various backgrounds and lighting conditions; it is not limited to a lab environment. Create a new file, BlobDetector.cpp, for the implementation of our BlobDetector class. 
(To review the header, refer back to the Defining blobs and a blob detector section.) At the top of BlobDetector.cpp, we will define several constants that pertain to the breadth of the background color range, the size and smoothing of the blobs, and the color of the blobs' rectangles in the preview image. Here is the relevant code: #include <opencv2/imgproc.hpp>   #include "BlobDetector.h"   const double MASK_STD_DEVS_FROM_MEAN = 1.0; const double MASK_EROSION_KERNEL_RELATIVE_SIZE_IN_IMAGE = 0.005; const int MASK_NUM_EROSION_ITERATIONS = 8;   const double BLOB_RELATIVE_MIN_SIZE_IN_IMAGE = 0.05;   const cv::Scalar DRAW_RECT_COLOR(0, 255, 0); // Green Of course, the heart of BlobDetector is its detect method. Optionally, the method creates a downsized version of the image for faster processing. Then, we call a helper method, createMask, to perform thresholding and erosion on the (resized) image. We pass the resulting mask to the cv::Canny function to perform Canny edge detection. We pass the edge mask to the cv::findContours function, which populates a vector of contours, in the vector<vector<cv::Point>> format. That is to say, each contour is a vector of points. For each contour, we find the points' bounding rectangle. If we are working with a resized image, we restore the bounding rectangle to the original scale. We reject rectangles that are very small. Finally, for each accepted rectangle, we put a new Blob object in the output vector and optionally draw the rectangle in the original image. Here is the detect method's implementation: void BlobDetector::detect(cv::Mat &image,   std::vector<Blob>&blobs, double resizeFactor, bool draw) {   blobs.clear();     if (resizeFactor == 1.0) {     createMask(image);   } else {     cv::resize(image, resizedImage, cv::Size(), resizeFactor,       resizeFactor, cv::INTER_AREA);     createMask(resizedImage);   }     // Find the edges in the mask.   cv::Canny(mask, edges, 191, 255);     // Find the contours of the edges.   cv::findContours(edges, contours, hierarchy, cv::RETR_TREE,     cv::CHAIN_APPROX_SIMPLE);     std::vector<cv::Rect> rects;   int blobMinSize = (int)(MIN(image.rows, image.cols) *     BLOB_RELATIVE_MIN_SIZE_IN_IMAGE);   for (std::vector<cv::Point> contour : contours) {       // Find the contour's bounding rectangle.     cv::Rect rect = cv::boundingRect(contour);       // Restore the bounding rectangle to the original scale.     rect.x /= resizeFactor;     rect.y /= resizeFactor;     rect.width /= resizeFactor;     rect.height /= resizeFactor;       if (rect.width < blobMinSize || rect.height < blobMinSize) {       continue;     }       // Create the blob from the sub-image inside the bounding     // rectangle.     blobs.push_back(Blob(cv::Mat(image, rect)));       // Remember the bounding rectangle in order to draw it later.     rects.push_back(rect);   }     if (draw) {     // Draw the bounding rectangles.     for (const cv::Rect &rect : rects) {       cv::rectangle(image, rect.tl(), rect.br(), DRAW_RECT_COLOR);     }   } } The getMask method simply returns the mask that we previously computed in the detect method: const cv::Mat &BlobDetector::getMask() const {   return mask; } The createMask helper method begins by finding the image's mean color and standard deviation using the cv::meanStdDev function. We calculate a range of background colors based on a certain number of standard deviations from the mean, as defined by the MASK_STD_DEVS_FROM_MEAN constant near the top of BlobDetector.cpp. 
We deem values outside this range to be foreground colors. Using the cv::inRange function, we map the background colors (in the image) to white (in the mask) and the foreground colors (in the image) to black (in the mask). Then, we create a square kernel using the cv::getStructuringElement function. Finally, we use the kernel in the cv::erode function to apply the erosion morphological operation to the mask. This has the effect of smoothing the black (foreground) regions such that they swallow up little gaps that are probably just noise. Here is the relevant code: void BlobDetector::createMask(const cv::Mat &image) {     // Find the image's mean color.   // Presumably, this is the background color.   // Also find the standard deviation.   cv::Scalar meanColor;   cv::Scalar stdDevColor;   cv::meanStdDev(image, meanColor, stdDevColor);     // Create a mask based on a range around the mean color.   cv::Scalar halfRange = MASK_STD_DEVS_FROM_MEAN * stdDevColor;   cv::Scalar lowerBound = meanColor - halfRange;   cv::Scalar upperBound = meanColor + halfRange;   cv::inRange(image, lowerBound, upperBound, mask);     // Erode the mask to merge neighboring blobs.   int kernelWidth = (int)(MIN(image.cols, image.rows) *     MASK_EROSION_KERNEL_RELATIVE_SIZE_IN_IMAGE);   if (kernelWidth > 0) {     cv::Size kernelSize(kernelWidth, kernelWidth);     cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT,       kernelSize);     cv::erode(mask, mask, kernel, cv::Point(-1, -1),       MASK_NUM_EROSION_ITERATIONS);   } } That is the end of the blob detector's code. As you can see, it uses a general-purpose and rather linear approach, without any special cases for different kinds of objects.Moreover, we are using a separate blob detector and blob classifier, and this separation of responsibilities enables us to keep each class's implementation relatively simple. For completeness, note that the Blob class's constructors have straightforward implementations that copy the arguments. For the blob's image, we make a deep copy because the original may change. (For example, the original may be a subimage in a frame of video, and after detection we may draw rectangles atop the frame of video.) Similarly, Blob's getter and setter methods are self-explanatory. Create a new file, Blop.cpp, and fill it with the following implementation: #import "Blob.h"   Blob::Blob(const cv::Mat &mat, uint32_t label) : label(label) {   mat.copyTo(this->mat); }   Blob::Blob() { }   Blob::Blob(const Blob &other) : label(other.label) {   other.mat.copyTo(mat); }   bool Blob::isEmpty() const {   return mat.empty(); } uint32_t Blob::getLabel() const {   return label; } void Blob::setLabel(uint32_t value) {   label = value; } const cv::Mat &Blob::getMat() const {   return mat; } int Blob::getWidth() const {   return mat.cols; } int Blob::getHeight() const {   return mat.rows; } Classifying blobs by color and keypoints Our classifier operates on the assumption that a blob contains distinctive colors, distinctive keypoints, or both. To conserve memory and precompute as much relevant information as possible, we do not store images of the reference blobs, but instead we store histograms and keypoint descriptors. Create a new file, BlobClassifier.cpp, for the implementation of our BlobClassifier class. (To review the header, refer back to the Defining blob descriptors and a blob classifier section.) 
At the top of BlobClassifier.cpp, we will define several constants that pertain to the number of histogram bins, the histogram comparison method, and the relative importance of the histogram comparison versus the keypoint comparison. Here is the relevant code:

#include <opencv2/imgproc.hpp>

#include "BlobClassifier.h"

#ifdef WITH_OPENCV_CONTRIB
#include <opencv2/xfeatures2d.hpp>
#endif

const int HISTOGRAM_NUM_BINS_PER_CHANNEL = 32;
const int HISTOGRAM_COMPARISON_METHOD = cv::HISTCMP_CHISQR_ALT;

const float HISTOGRAM_DISTANCE_WEIGHT = 0.98f;
const float KEYPOINT_MATCHING_DISTANCE_WEIGHT = 1.0f -
  HISTOGRAM_DISTANCE_WEIGHT;

Beware that the HISTOGRAM_NUM_BINS_PER_CHANNEL constant has a cubic relationship to memory usage. For each blob descriptor, we store a three-dimensional (BGR) histogram with HISTOGRAM_NUM_BINS_PER_CHANNEL^3 elements, and each element is a 32-bit (4-byte) floating point number. If the constant is 32, each histogram's size in megabytes is (32^3)*4/(10^6), which is approximately 0.13. This is fine for a small set of reference descriptors. If the constant is 256 (the maximum number of bins for an 8-bit color channel), each histogram's size grows to roughly (256^3)*4/(10^6), or about 67 megabytes! For an iOS application, this is unacceptable, given the platform's memory constraints. At best, in a high-end iOS device, one gigabyte of RAM might be available to each application. Conservatively, you should worry if your app's memory usage approaches 100 megabytes.

Remember that OpenCV's SURF implementation is in the xfeatures2d module, which is part of opencv_contrib. If opencv_contrib is available, let's define the WITH_OPENCV_CONTRIB preprocessor flag. Then, our code imports the <opencv2/xfeatures2d.hpp> header, and we use SURF. Otherwise, we use ORB. This selection also affects the implementation of BlobClassifier's constructor. OpenCV provides factory methods for various feature detectors, descriptors, and matchers, so we simply have to use the right combination of factory methods for SURF with Flann matching, or ORB with brute-force matching based on the Hamming distance. Here is the constructor's implementation:

BlobClassifier::BlobClassifier() {
#ifdef WITH_OPENCV_CONTRIB
  featureDetectorAndDescriptorExtractor =
    cv::xfeatures2d::SURF::create();
  descriptorMatcher = cv::DescriptorMatcher::create("FlannBased");
#else
  featureDetectorAndDescriptorExtractor = cv::ORB::create();
  descriptorMatcher = cv::DescriptorMatcher::create(
    "BruteForce-HammingLUT");
#endif
}

The update method's implementation calls a helper method, createBlobDescriptor, and adds the resulting BlobDescriptor to a vector of reference descriptors:

void BlobClassifier::update(const Blob &referenceBlob) {
  referenceBlobDescriptors.push_back(
    createBlobDescriptor(referenceBlob));
}

The clear method's implementation discards all the reference descriptors such that the BlobClassifier reverts to its initial, untrained state:

void BlobClassifier::clear() {
  referenceBlobDescriptors.clear();
}

The implementation of the classify method relies on another helper method, findDistance. For each reference descriptor, classify calls findDistance to obtain a measure of dissimilarity between the query blob's descriptor and the reference descriptor. We find the reference descriptor with the least distance (best similarity) and return its label as the classification result. If there are no reference descriptors, classify returns 0, the "unknown" label.
Here is classify's implementation: void BlobClassifier::classify(Blob &detectedBlob) const {   BlobDescriptor detectedBlobDescriptor =     createBlobDescriptor(detectedBlob);   float bestDistance = FLT_MAX;   uint32_t bestLabel = 0;   for (const BlobDescriptor &referenceBlobDescriptor :       referenceBlobDescriptors) {     float distance = findDistance(detectedBlobDescriptor,       referenceBlobDescriptor);     if (distance < bestDistance) {       bestDistance = distance;       bestLabel = referenceBlobDescriptor.getLabel();     }   }   detectedBlob.setLabel(bestLabel); } The createBlobDescriptor helper method is responsible for calculating a normalized histogram of Bloband keypoint descriptors and using them to build a new BlobDescriptor. To calculate the (non-normalized) histogram, we use the cv::calcHist function. Among its arguments, it requires three arrays to specify the channels we want to use, the number of bins per channel, and the range of each channel's values. To normalize the resulting histogram, we divide by the number of pixels in the blob's image. The following code, pertaining to the histogram, is the first half of implementation of createBlobDescriptor: BlobDescriptor BlobClassifier::createBlobDescriptor(   const Blob &blob) const {    const cv::Mat &mat = blob.getMat();   int numChannels = mat.channels();     // Calculate the histogram of the blob's image.   cv::Mat histogram;   int channels[] = { 0, 1, 2 };   int numBins[] = { HISTOGRAM_NUM_BINS_PER_CHANNEL,     HISTOGRAM_NUM_BINS_PER_CHANNEL,     HISTOGRAM_NUM_BINS_PER_CHANNEL };   float range[] = { 0.0f, 256.0f };   const float *ranges[] = { range, range, range };   cv::calcHist(&mat, 1, channels, cv::Mat(), histogram, 3,     numBins, ranges);     // Normalize the histogram.   histogram *= (1.0f / (mat.rows * mat.cols)); Now, we must convert the blob's image to grayscale and obtain keypoints and keypoint descriptors using the detect and compute methods of cv::Feature2D. With the normalized histogram and keypoint descriptors, we have everything that we need to construct and return a new BlobDescriptor. Here is the remainder of implementation of createBlobDescriptor: // Convert the blob's image to grayscale.   cv::Mat grayMat;   switch (numChannels) {     case 4:       cv::cvtColor(mat, grayMat, cv::COLOR_BGRA2GRAY);       break;     default:       cv::cvtColor(mat, grayMat, cv::COLOR_BGR2GRAY);       break;   }     // Detect features in the grayscale image.   std::vector<cv::KeyPoint> keypoints;   featureDetectorAndDescriptorExtractor->detect(grayMat,     keypoints);     // Extract descriptors of the features.   cv::Mat keypointDescriptors;   featureDetectorAndDescriptorExtractor->compute(grayMat,     keypoints, keypointDescriptors);     return BlobDescriptor(histogram, keypointDescriptors,     blob.getLabel()); } The findDistance helper method performs histogram comparison using the cv::compareHist function and keypoint matching using the match method of cv::DescriptorMatcher. Each of the resulting keypoint matches has a distance, and we sum these distances. Then, as an overall measure of distance between the two blob descriptors, we return a weighted average of the histogram distance and the total keypoint matching distance. Here is the relevant code: float BlobClassifier::findDistance(   const BlobDescriptor &detectedBlobDescriptor,   const BlobDescriptor &referenceBlobDescriptor) const {    // Calculate the histogram distance.   
float histogramDistance = (float)cv::compareHist(     detectedBlobDescriptor.getNormalizedHistogram(),     referenceBlobDescriptor.getNormalizedHistogram(),     HISTOGRAM_COMPARISON_METHOD);     // Calculate the keypoint matching distance.   float keypointMatchingDistance = 0.0f;   std::vector<cv::DMatch> keypointMatches;   descriptorMatcher->match(     detectedBlobDescriptor.getKeypointDescriptors(),     referenceBlobDescriptor.getKeypointDescriptors(),     keypointMatches);   for (const cv::DMatch &keypointMatch : keypointMatches) {     keypointMatchingDistance += keypointMatch.distance;   }     return histogramDistance * HISTOGRAM_DISTANCE_WEIGHT +     keypointMatchingDistance * KEYPOINT_MATCHING_DISTANCE_WEIGHT; } That is the end of the blob classifier's code. Again, we see that a single class can provide useful, general-purpose computer vision functionality without a terribly complicated implementation. Perhaps this is a Zen moment; our previous work and studieshave been a path to (some kind of) simplicity! Of course, OpenCV hides a lot of complexity for us in its implementations of histogram-related functions and keypoint-related classes, and in this way, the library offers us a relatively gentle path. For completeness, note that the BlobDescriptor class has a straightforward implementation. Create a new file, BlobDescriptor.cpp, and fill it with the following bodies for a constructor and getters: #include "BlobDescriptor.h"   BlobDescriptor::BlobDescriptor(const cv::Mat &normalizedHistogram, const cv::Mat &keypointDescriptors, uint32_t label) : normalizedHistogram(normalizedHistogram) , keypointDescriptors(keypointDescriptors) , label(label) { }   const cv::Mat &BlobDescriptor::getNormalizedHistogram() const {   return normalizedHistogram; } const cv::Mat &BlobDescriptor::getKeypointDescriptors() const {   return keypointDescriptors; } uint32_t BlobDescriptor::getLabel() const {   return label; } Summary Now, we have finished all the code for the detector, descriptor, and classifier! Again, for more information, refer to Chapter 5, Classifying Coins and Commodities in the book,iOS Application Development with OpenCV 3. Resources for Article: Further resources on this subject: Making subtle color shifts with curves [article] New functionality in OpenCV 3.0 [article] Camera Calibration [article]
Incident Response and Live Analysis
In this article by Ayman Shaaban and Konstantin Sapronov, authors of the book Practical Windows Forensics, we describe the stages of preparation for responding to an incident, a matter to which much attention should be paid. In some cases, the lack of necessary tools during the incident leads to the inability to perform the necessary actions at the right time. Taking into account that the reaction time to an incident depends on the efficiency of the incident handling process, it becomes clear that the technical preparation of the IR team must be handled very carefully. The whole set of requirements for the IR team can be divided into several categories:

Skills
Hardware
Software

(For more resources related to this topic, see here.)

Let's consider the main issues that may arise during the preparation of the incident response team in more detail. If we want to build a computer security incident response team, we need people with a certain set of skills and technical expertise to perform technical tasks and effectively communicate with other external contacts. Now, we will consider the skills of members of the team. The set of skills that members of the team need to have can be divided into two groups:

Personal skills
Technical skills

Personal skills

Personal skills are very important for a successful response team. This is because interaction with team members who are technical experts but have poor social skills can lead to misunderstanding and misinterpretation of the results, the consequences of which may affect the team's reputation. A list of key personal skills will be discussed in the following sections.

Written communication

For many IR teams, a large part of their communication occurs through written documents. These communications can take many forms, including e-mails concerning incidents, documentation of event or incident reports, vulnerabilities, and other technical information notifications. Incident response team members must be able to write clearly and concisely, describe activities accurately, and provide information that is easy for their readers to understand.

Oral communication

The ability to communicate effectively through spoken communication is also an important skill to ensure that the incident response team members say the right words to the right people.

Presentation skills

Not all technical experts have good presentation skills. They may not be comfortable in front of a large audience. Gaining confidence in presentation skills will take time and effort for the team's members to become more experienced and comfortable in such situations.

Diplomacy

The members of the incident response team interact with people who may have a variety of goals and needs. Skilled incident response team members will be able to anticipate potential points of contention, respond appropriately, maintain good relationships, and avoid offending others. They also will understand that they are representing the IR team and their organization. Diplomacy and tact are very important.

The ability to follow policies and procedures

Another important skill that members of the team need is the ability to follow and support the established policies and procedures of the organization or team.

Team skills

IR staff must be able to work in the team environment as productive and cordial team players. They need to be aware of their responsibilities, contribute to the goals of the team, and work together to share information, workload, and experiences.
They must be flexible and willing to adapt to change. They also need skills to interact with other parties. Integrity The nature of IR work means that team members often deal with information that is sensitive and, occasionally, they might have access to information that is newsworthy. The team's members must be trustworthy, discrete, and able to handle information in confidence according to the guidelines, any constituency agreements or regulations, and/or any organizational policies and procedures. In their efforts to provide technical explanations or responses, the IR staff must be careful to provide appropriate and accurate information while avoiding the dissemination of any confidential information that could detrimentally affect another organization's reputation, result in the loss of the IR team's integrity, or affect other activities that involve other parties. Knowing one's limits Another important ability that the IR team's members must have is the ability to be able to readily admit when they have reached the limit of their own knowledge or expertise in a given area. However difficult it is to admit a limitation, individuals must recognize their limitations and actively seek support from their team members, other experts, or their management. Coping with stress The IR team's members often could be in stressful situations. They need to be able to recognize when they are becoming stressed, be willing to make their fellow team members aware of the situation, and take (or seek help with) the necessary steps to control and maintain their composure. In particular, they need the ability to remain calm in tense situations—ranging from an excessive workload to an aggressive caller to an incident where human life or a critical infrastructure may be at risk. The team's reputation, and the individual's personal reputation, will be enhanced or will suffer depending on how such situations are handled. Problem solving IR team members are confronted with data every day, and sometimes, the volume of information is large. Without good problem-solving skills, staff members could become overwhelmed with the volumes of data that are related to incidents and other tasks that need to be handled. Problem-solving skills also include the ability for the IR team's members to "think outside the box" or look at issues from multiple perspectives to identify relevant information or data. Time management Along with problem-solving skills, it is also important for the IR team's members to be able to manage their time effectively. They will be confronted with a multitude of tasks ranging from analyzing, coordinating, and responding to incidents, to performing duties, such as prioritizing their workload, attending and/or preparing for meetings, completing time sheets, collecting statistics, conducting research, giving briefings and presentations, traveling to conferences, and possibly providing on-site technical support. Technical skills Another important component of the skills needed for an IR team to be effective is the technical skills of their staff. These skills, which define the depth and breadth of understanding of the technologies that are used by the team, and the constituency it serves, are outlined in the following sections. In turn, the technical skills, which the IR team members should have, can be divided into two groups: security fundamentals and incident handling skills. Security fundamentals Let's look at some of the security fundamentals in the following subsections. 
Security principles The IR team's members need to have a general understanding of the basic security principles, such as the following: Confidentiality Availability Authentication Integrity Access control Privacy Nonrepudiation Security vulnerabilities and weaknesses To understand how any specific attack is manifested in a given software or hardware technology, the IR team's members need to be able to first understand the fundamental causes of vulnerabilities through which most attacks are exploited. They need to be able to recognize and categorize the most common types of vulnerabilities and associated attacks, such as those that might involve the following: Physical security issues Protocol design flaws (for example, man-in-the-middle attacks or spoofing) Malicious code (for example, viruses, worms, or Trojan horses) Implementation flaws (for example, buffer overflow or timing windows/race conditions) Configuration weaknesses User errors or indifference The Internet It is important that the IR team's members also understand the Internet. Without this fundamental background information, they will struggle or fail to understand other technical issues, such as the lack of security in underlying protocols and services that are used on the Internet or to anticipate the threats that might occur in the future. Risks The IR team's members need to have a basic understanding of computer security risk analysis. They should understand the effects on their constituency of various types of risks (such as potentially widespread Internet attacks, national security issues as they relate to their team and constituency, physical threats, financial threats, loss of business, reputation, or customer confidence, and damage or loss of data). Network protocols Members of the IR team need to have a basic understanding of the common (or core) network protocols that are used by the team and the constituency that they serve. For each protocol, they should have a basic understanding of the protocol, its specifications, and how it is used. In addition to this, they should understand the common types of threats or attacks against the protocol, as well as strategies to mitigate or eliminate such attacks. For example, at a minimum, the staff should be familiar with protocols, such as IP, TCP, UDP, ICMP, ARP, and RARP. They should understand how these protocols work, what they are used for, the differences between them, some of the common weaknesses, and so on. In addition to this, the staff should have a similar understanding of protocols, such as TFTP, FTP, HTTP, HTTPS, SNMP, SMTP, and any other protocols. The specialist skills include a more in-depth understanding of security concepts and principles in all the preceding areas in addition to expert knowledge in the mechanisms and technologies that lead to flaws in these protocols, the weaknesses that can be exploited (and why), the types of exploitation methods that would likely be used, and the strategies to mitigate or eliminate these potential problems. They should have expert understanding of additional protocols or Internet technologies (DNSSEC, IPv6, IPSEC, and other telecommunication standards that might be implemented or interface with their constituent's networks, such as ATM, BGP, broadband, voice over IP, wireless technology, other routing protocols, or new emerging technologies, and so on). They could then provide expert technical guidance to other members of the team or constituency. 
Network applications and services The IR team's staff need a basic understanding of the common network applications and services that the team and the constituency use (DNS, NFS, SSH, and so on). For each application or service they should understand the purpose of the application or service, how it works, its common usages, secure configurations, and the common types of threats or attacks against the application or service, as well as mitigation strategies. Network security issues The members of the IR team should have a basic understanding of the concepts of network security and be able to recognize vulnerable points in network configurations. They should understand the concepts and basic perimeter security of network firewalls (design, packet filtering, proxy systems, DMZ, bastion hosts, and so on), router security, the potential for information disclosure of data traveling across the network (for example, packet monitoring or "sniffers"), or threats that are related to accepting untrustworthy information. Host or system security issues In addition to understanding security issues at a network level, the IR team's members need to understand security issues at a host level for the various types of operating systems (UNIX, Windows, or any other operating systems that are used by the team or constituency). Before understanding the security aspects, the IR team's member must first have the following: Experience using the operating system (user security issues) Some familiarity with managing and maintaining the operating system (as an administrator) Then, for each operating system, the IR team member needs to know how to perform the following: Configure (harden) the system securely Review configuration files for security weaknesses Identify common attack methods Determine whether a compromise attempt occurred Determine whether an attempted system compromise was successful Review log files for anomalies Analyze the results of attacks Manage system privileges Secure network daemons Recover from a compromise Malicious code The IR team's members must understand the different types of malicious code attacks that occur and how these can affect their constituency (system compromises, denial of service, loss of data integrity, and so on). Malicious code can have different types of payloads that can cause a denial of service attack or web defacement, or the code can contain more "dynamic" payloads that can be configured to result in multifaceted attack vectors. Staff should understand not only how malicious code is propagated through some of the obvious methods (disks, e-mail, programs, and so on), but they should also understand how it can propagate through other means, such as PostScript, Word macros, MIME, peer-to-peer file sharing, or boot-sector viruses that affect operating systems running on PC and Macintosh platforms. The IR team's staff must be aware of how such attacks occur and are propagated, the risks and damage associated with such attacks, prevention and mitigation strategies, detection and removal processes, and recovery techniques. Specialist skills include expertise in performing analysis, black box testing, reverse engineering malicious code that is associated with such attacks, and in providing advice to the team on the best approaches for an effective response. Programming skills Some team members need to have system and network programming experience. 
The team should ensure that a range of programming languages is covered on the operating systems that the team and the constituency use. For example, the team should have experience in the following: C Python Awk Java Shell (all variations) Other scripting tools These scripts or programming tools can be used to assist in the analysis and handling of incident information (for example, writing different scripts to count and sort through various logs, search databases, look up information, extract information from logs or files, and collect and merge data). Incident handling skills Local team policies and protocols Understanding and identifying intruder techniques Communication with sites Incident analysis Maintenance of incident records The hardware for IR and Jump Bag Certainly, a set of equipment that may be required during the processing of the incident should be prepared in advance, and this matter should be given much attention. This set is called the Jump Bag. The formation of such a kit is largely due to the budget the organization could afford. Nevertheless, there is a certain necessary minimum, which will allow the team to handle incidents in small quantities. If the budget allows it, it is possible to buy a turnkey solution, which includes all the necessary equipment and the case for its transportation. As an instance of such a solution, FREDL + Ultra Kit could be recommended. FREDL is short for Forensic Recovery of Evidence Device Laptop. With Ultra Kit, this solution will cost about 5000 USD. Ultra Kit contains a set of write-blockers and a set of adapters and connecters to obtain images of hard drives with a different interface: More details can be found on the manufacturer's website at https://www.digitalintelligence.com/products/ultrakit/. Certainly, if we ignore the main drawback of such a solution, this decision has a lot of advantages as compared to the cost. Besides this, you get a complete starter kit to handle the incident. Besides, Ultra Kit allows you to safely transport equipment without fear of damage. The FRED-L laptop is based on a modern hardware, and the specifications are constantly updated to meet modern requirements. Current specifications can be found on the manufacturer's website at http://www.digitalintelligence.com/products/fredl/. However, if you want to replace the expensive solution, you could build a cheaper alternative that will save 20-30% of the budget. It is possible to buy the components included in the review of decisions separately. As a workstation, you can choose a laptop with the following specifications: Intel Core i7-6700K Skylake Quad Core Processor, 4.0 GHz, 8MB Intel Smart Cache 16 GB PC4-17000 DDR4 2133 Memory 256 GB Solid State Internal SATA Drive Intel Z170 Express Chipset NVIDIA GeForce GTX 970M with 6 GB GDDR5 VRAM This specification will provide a comfortable workstation to work on the road. As a case study for the transport of the equipment, we recommend paying attention to Pelican (http://www.pelican.com) cases. In this case, the manufacturer can choose the equipment to meet your needs. One of the typical tasks in handling of incidents is obtaining images from hard drives. For this task, you can use a duplicator or a bunch of write-blockers and computer. Duplicators are certainly a more convenient solution; their usage allows you to quickly get the disk image without using additional software. Their main drawback is the price. 
However, if you often have to extract the image of hard drives and you have a few thousand dollars, the purchase of the duplicator is a good investment. If the imaging of hard drives is a relatively rare problem and you have a limited budget, you can purchase a write blocker which will cost 300-500 USD. However, it is necessary to use a computer and software. To pick up the necessary equipment, you can visit http://www.insectraforensics.com, where you can find equipment from different manufacturers. Also, do not forget about the hard drives themselves. It is worth buying a few hard drives with large volumes for the possibility of good performance. To summarize, responders need to include the following items in a basic set: Several network cables (straight through or loopback) A serial cable with a serial USB adapter Network serial adapters Hard drives (various sizes) Flash drives A Linux Live DVD A portable drive duplicator with a write-blocker Various drive interface adapters A four port hub A digital camera Cable ties Cable snips Assorted screws and hex drivers Notebooks and pens Chain of Custody forms Incident handling procedure Software After talking about the hardware, we did not forget about the software that you should always have on hand. The variety of software that can be used in the processing of the incident allows you to select software-based preferences, skills, and budget. Some prefer command-line utilities, and some find that GUI is more convenient to use. Sometimes, the use of certain tools is dictated by the circumstances under which it's needed to work. We strongly recommend that you prepare these in advance and thoroughly test the entire set of required software. Live versus mortem The initial reaction to an incident is a very important step in the process of computer incident management. The correct method of carrying out and performing this step depends on the success of the investigation. Moreover, a correct and timely response is needed to reduce the damage caused by the incident. The traditional approach to the analysis of the disks is not always practical, and in some cases, it is simply not possible. In today's world, the development of computer technology has led to many companies having a distribution network in many cities, countries, and continents. Wish this physical disconnection of the computer from the network, following the traditional investigation of each computer is not possible. In such cases, the incident responder should be able to carry out a prior assessment remotely and as soon as possible, view a list of running processes, open network connections, open files, and get a list of registered users in the system. Then, if necessary, carry out a full investigation. In this article, we will look at some approaches that the responder may apply in a given situation. However, even in these cases when we have physical access to the machine, live response is the only way of incident response. For example, cases where we are dealing with large disk arrays. In this case, there are several problems at once. The first problem is that the space to store large amounts of data is also difficult to identify. In addition to this, the time that may be required to analyze large amounts of data is unreasonably high. Typically, such large volumes of data have a highly loaded server serving hundreds of thousands of users, so their trip, or even a reboot, is not acceptable for business. 
Another scenario that requires the Live Forensics approach is when an encrypted filesystem is used. In cases where the analyst doesn't have the key to decrypt the disc, Live Forensics is a good alternative to obtain data from a system where encryption of the filesystem is used. This is not an exhaustive list of cases when the Live Analysis could be applicable. It is worth noting one very important point. During the Live Analysis, it is not possible to avoid changes in the system. Connecting external USB devices or network connectivity, user log on, or launching an executable file will be modified in the system in a variety of log files, registry keys, and so on. Therefore, you need to understand what changes were caused by the actions of responders and document them. Volatile data Under the principle of "order of Volatility", you must first collect information that is classified as Volatile Data (the list of network connections, the list of running processes, log on sessions, and so on), which will be irretrievably lost in case the computer is powered off. Then, you can start to collect nonvolatile data, which can also be obtained with the traditional approach in the analysis of the disk image. The main difference in this case is that a Live Forensics set of data is easier to obtain with a working machine. This article will focus on the collection of Volatile data. Typically, this category includes the following data: System uptime and the current time Network parameters (NetBIOS name cache, active connections, the routing table, and so on). NIC configuration settings Logged on users and active sessions Loaded drivers Running services Running processes and their related parameters (loaded DLLs, open handles, and ownership) Autostart modules Shared drives and files opened remotely Recording the time and date of the data collection allows you to define a time interval in which the investigator will perform an analysis of the system: (date / t) & (time / t)>%COMPUTER_NAME% systime.txt systeminfo | find "Boot Time" >>% COMPUTERNAME% systime.txt The last command allows you to< show how long the machine worked since the last reboot. Using the %COMPUTERNAME% environment variable, we can set up separate directories for each machine in case we need to repeat the process of collecting information on different computers in a network. In some cases, signs of compromise are clearly visible in the analysis of network activity. The next set of commands allows you to get this information: nbtstat -c> %COMPUTERNAME%NetNameCache.txt netstat -a -n -o>%COMPUTERNAME%NetStat.txt netstat -rn>%COMPUTNAME%NetRoute.txt ipconfig / all>%COMPUTERNAME%NIC.txt promqry>%COMPUTERNAME%NSniff.txt The first command uses nbtstat.exe to obtain information from the cache of NetBIOS. You display the NetBIOS names in their corresponding IP address. The second and third commands use netstat.exe to record all of the active compounds, listening ports, and routing tables. For information about network settings, the ipconfig.exe network interfaces command is used. The last block command starts the Microsoft promqry utility, which allows you to define the network interfaces on the local machine, which operates in promiscuous mode. This mode is required for network sniffers, so the detection of the regime indicates that the computer can run software that listens to network traffic. 
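In practice, it is convenient to wrap the volatile-data commands shown so far in a small collection script that first creates the per-machine output directory. The following batch sketch is an illustrative addition rather than part of the original toolkit; the directory layout and file names are assumptions:

@echo off
rem Create a per-machine folder for the collected output
if not exist %COMPUTERNAME% mkdir %COMPUTERNAME%

rem Record the collection time and the last boot time
date /t >  %COMPUTERNAME%\systime.txt
time /t >> %COMPUTERNAME%\systime.txt
systeminfo | find "Boot Time" >> %COMPUTERNAME%\systime.txt

rem Collect the network-related volatile data
nbtstat -c       > %COMPUTERNAME%\NetNameCache.txt
netstat -a -n -o > %COMPUTERNAME%\NetStat.txt
netstat -rn      > %COMPUTERNAME%\NetRoute.txt
ipconfig /all    > %COMPUTERNAME%\NIC.txt

Collecting everything into one folder named after the machine keeps the output from different hosts separate when the same script is reused across a network.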
To enumerate all the logged on users on the computer, you can use the Sysinternals tools: psloggedon -x>%COMPUTERNAME% LoggedUsers.tx: logonsessions -p >> %COMPUTERNAME%LoggedOnUsers.txt The PsLoggedOn.exe command lists both types of users, those who are logged on to the computer locally, and those who logged on remotely over the network. Using the -x switch, you can get the time at which each user logged on. With the -p key, logonsessions will display all of the processes that were started by the user during the session. It should be noted that logonsessions must be run with administrator privileges. To get a list of all drivers that are loaded into the system, you can use the WDK drivers.exe utility: drivers.exe>%COMPUTERNAME%drivers.txt The next set of commands to obtain a list of running processes and related information is as follows: tasklist / svc>%COMPUTERNAME% taskdserv.txt psservice>%COMPUTERNAME% trasklst.txt tasklist / v>%COMPUTERNAME% taskuserinfo.txt pslist / t>%COMPUTERNAME%tasktree.txt listdlls>%COMPUTERNAME%lstdlls.txt handle -a>%COMPUTERNAME%lsthandles.txt The tasklist.exe utility that is made with the / svc key enumerates the list of running processes and services in their context. While the previous command displays a list of running services, PsService receives information on services using the information in the registry and SCM database. Services are a traditional way through which attackers can access a previously compromised system. Services can be configured to run automatically without user intervention, and they can be launched as part of another process, such as svchost.exe. In addition to this, remote access can be provided through completely legitimate services, such as telnet or ftp. To associate users with their running processes, use the tasklist / v command key. To enumerate a list of DLLs loaded in each process and the full path to the DLL, you can use listsdlls.exe from SysInternals. Another handle.exe utility can be used to list all the handles, which are open processes. This handles registry keys, files, ports, mutexes, and so on. Other utilities require run with administrator privileges. These tools can help identify malicious DLLs that were injected into the processes, as well as files, which have not been accessed by these processes. The next group of commands allows you to get a list of programs that are configured to start automatically: autorunsc.exe -a>%COMPUTERNAME% autoruns.txt at>%COMPUTERNAME% at.txt schtasks / query>%COMPUTERNAME% schtask.txt The first command starts the SysInternals utility, autoruns, and displays a list of executables that run at system startup and when users log on. This utility allows you to detect malware that uses the popular and well-known methods for persistent installation into the system. Two other commands (at and schtasks) display a list of commands that run in the schedule. To start the at command also requires administrator privileges. To install backdoors mechanisms, services are often used, but services are constantly working in the system and, thus, can be easily detected during live response. Thus, create a backdoor that runs on a schedule to avoid detection. For example, an attacker could create a task that will run the malware just outside working hours. 
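When reviewing the scheduled tasks collected above, the verbose list format is often more revealing than the default table because it includes the task author, the command line, and the last and next run times. A possible variant of the schtasks command (an illustrative addition, not from the original checklist) is:

schtasks /query /fo LIST /v > %COMPUTERNAME%\schtask_verbose.txt

Searching this output for tasks created by unexpected accounts, or scheduled to run outside working hours, helps to spot exactly the kind of backdoor described above.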
To get a list of network share drives and disk files that are deleted, you can use the following two commands: psfile>%COMPUTERNAME%openfileremote.txt net share>%COMPUTERNAME%drives.txt Nonvolatile data After Volatile data has been collected, you can continue to collect Nonvolatile Data. This data can be obtained at the stage of analyzing the disk, but as we mentioned earlier, analysis of the disk is not possible in some cases. This data includes the following: The list of installed software and updates User info Metadata about a filesystem's timestamps Registry data However, upon receipt of this data with the live running of the system, there are difficulties that are associated with the fact that many of these files cannot be copied in the usual way, as they are locked by the operating system. To do this, use one of the utilities. One such utility is the RawCopy.exe utility, which is authored by Joakim Schicht. This is a console application that copies files off NTFS volumes using the low-level disk reading method. The application has two mandatory parameters, target file and output path: -param1: This is the full path to the target file to extract; it also supports IndexNumber instead of file path -param2: This is a valid path to output directory This tool will let you copy files that are usually not accessible because the system has locked them. For instance, the registry hives such as SYSTEM and SAM, files inside SYSTEM VOLUME INFORMATION, or any file on the volume. This supports the input file specified either with the full file path or by its $MFT record number (index number). Here's an example of copying the SYSTEM hive off a running system: RawCopy.exe C:WINDOWSsystem32configSYSTEM %COMPUTERNAME%SYSTEM Here's an example of extracting the $MFT by specifying its index number: RawCopy.exe C:0 %COMPUTERNAME%mft Here's an example of extracting the MFT reference number 30224 and all attributes, including $DATA, and dumping it into C:tmp: RawCopy.exe C:30224 C:tmp -AllAttr To download RawCopy, go to https://github.com/jschicht/RawCopy. Knowing what software is installed and what its updates are helps further the investigation because this shows possible ways to compromise a system through a vulnerability in the software. One of the first actions that the attacker makes is to attack during a system scan to detect active services and exploit the vulnerabilities in them. Thus, services that were not patched can be utilized for remote system penetration. One way to install a set of software and updates is to use the systeminfo utility: systeminfo > %COMPUTERNAME%sysinfo.txt. Moreover, skilled attackers can themselves perform the same actions and install necessary updates in order to hide the traces of penetration into the system. After identifying the vulnerable services and their successful exploits, the attacker creates an account for themselves in order to subsequently use legal ways to enter the system. 
Analysis of data about the users of the system can therefore reveal the following traces of the compromise:

The Recent folder contents, including LNK files and jump lists
LNK files in the Office Recent folder
The Network Recent folder contents
The entire Temp folder
The entire Temporary Internet Files folder
The PrivacIE folder
The Cookies folder
The Java Cache folder contents

Now, let's consider each of the preceding cases in turn (a short script that automates this pattern for a couple of the folders is sketched after the list).

Collecting the Recent folder is done as follows:

robocopy.exe %RECENT% %COMPUTERNAME%\Recent /ZB /copy:DAT /r:0 /ts /FP /np /E /log:%COMPUTERNAME%\Recent\log.txt

Here, %RECENT% depends on the version of Windows. For Windows 5.x (Windows 2000, Windows XP, and Windows 2003), this is as follows:

%RECENT% = %systemdrive%\Documents and Settings\%USERNAME%\Recent

For Windows 6.x (Windows Vista and newer), this is as follows:

%RECENT% = %systemdrive%\Users\%USERNAME%\AppData\Roaming\Microsoft\Windows\Recent

Collecting the Office Recent folder is done as follows:

robocopy.exe %RECENT_OFFICE% %COMPUTERNAME%\Recent_Office /ZB /copy:DAT /r:0 /ts /FP /np /E /log:%COMPUTERNAME%\Recent_Office\log.txt

Here, %RECENT_OFFICE% depends on the version of Windows. For Windows 5.x (Windows 2000, Windows XP, and Windows 2003), this is as follows:

%RECENT_OFFICE% = %systemdrive%\Documents and Settings\%USERNAME%\Application Data\Microsoft\Office\Recent

For Windows 6.x (Windows Vista and newer), this is as follows:

%RECENT_OFFICE% = %systemdrive%\Users\%USERNAME%\AppData\Roaming\Microsoft\Office\Recent

Collecting the Network Shares Recent folder is done as follows:

robocopy.exe %NetShares% %COMPUTERNAME%\NetShares /ZB /copy:DAT /r:0 /ts /FP /np /E /log:%COMPUTERNAME%\NetShares\log.txt

Here, %NetShares% depends on the version of Windows. For Windows 5.x (Windows 2000, Windows XP, and Windows 2003), this is as follows:

%NetShares% = %systemdrive%\Documents and Settings\%USERNAME%\NetHood

For Windows 6.x (Windows Vista and newer), this is as follows:

%NetShares% = %systemdrive%\Users\%USERNAME%\AppData\Roaming\Microsoft\Windows\Network Shortcuts

Collecting the Temporary folder is done as follows:

robocopy.exe %TEMP% %COMPUTERNAME%\TEMP /ZB /copy:DAT /r:0 /ts /FP /np /E /log:%COMPUTERNAME%\TEMP\log.txt

Here, %TEMP% depends on the version of Windows. For Windows 5.x (Windows 2000, Windows XP, and Windows 2003), this is as follows:

%TEMP% = %systemdrive%\Documents and Settings\%USERNAME%\Local Settings\Temp

For Windows 6.x (Windows Vista and newer), this is as follows:

%TEMP% = %systemdrive%\Users\%USERNAME%\AppData\Local\Temp

Collecting the Temporary Internet Files folder is done as follows:

robocopy.exe %TEMP_INTERNET_FILES% %COMPUTERNAME%\TEMP_INTERNET_FILES /ZB /copy:DAT /r:0 /ts /FP /np /E /log:%COMPUTERNAME%\TEMP_INTERNET_FILES\log.txt

Here, %TEMP_INTERNET_FILES% depends on the version of Windows. For Windows 5.x (Windows 2000, Windows XP, and Windows 2003), this is as follows:

%TEMP_INTERNET_FILES% = %systemdrive%\Documents and Settings\%USERNAME%\Local Settings\Temporary Internet Files

For Windows 6.x (Windows Vista and newer), this is as follows:

%TEMP_INTERNET_FILES% = %systemdrive%\Users\%USERNAME%\AppData\Local\Microsoft\Windows\Temporary Internet Files

Collecting the PrivacIE folder is done as follows:

robocopy.exe %PRIVACYIE% %COMPUTERNAME%\PrivacIE /ZB /copy:DAT /r:0 /ts /FP /np /E /log:%COMPUTERNAME%\PrivacIE\log.txt

Here, %PRIVACYIE% depends on the version of Windows. For Windows 5.x (Windows 2000, Windows XP, and Windows 2003), this is as follows:

%PRIVACYIE% = %systemdrive%\Documents and Settings\%USERNAME%\PrivacIE

For Windows 6.x (Windows Vista and newer), this is as follows:

%PRIVACYIE% = %systemdrive%\Users\%USERNAME%\AppData\Roaming\Microsoft\Windows\PrivacIE

Collecting the Cookies folder is done as follows:

robocopy.exe %COOKIES% %COMPUTERNAME%\Cookies /ZB /copy:DAT /r:0 /ts /FP /np /E /log:%COMPUTERNAME%\Cookies\log.txt

Here, %COOKIES% depends on the version of Windows. For Windows 5.x (Windows 2000, Windows XP, and Windows 2003), this is as follows:

%COOKIES% = %systemdrive%\Documents and Settings\%USERNAME%\Cookies

For Windows 6.x (Windows Vista and newer), this is as follows:

%COOKIES% = %systemdrive%\Users\%USERNAME%\AppData\Roaming\Microsoft\Windows\Cookies

Collecting the Java Cache folder is done as follows:

robocopy.exe %JAVACACHE% %COMPUTERNAME%\JAVACACHE /ZB /copy:DAT /r:0 /ts /FP /np /E /log:%COMPUTERNAME%\JAVACACHE\log.txt

Here, %JAVACACHE% depends on the version of Windows. For Windows 5.x (Windows 2000, Windows XP, and Windows 2003), this is as follows:

%JAVACACHE% = %systemdrive%\Documents and Settings\%USERNAME%\Application Data\Sun\Java\Deployment\cache

For Windows 6.x (Windows Vista and newer), this is as follows:

%JAVACACHE% = %systemdrive%\Users\%USERNAME%\AppData\LocalLow\Sun\Java\Deployment\cache
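A small driver script can pick the right source paths for the OS version and run the same robocopy options for each artifact. The version test below (parsing the output of ver) and the two folders chosen are illustrative assumptions of this sketch; extending it to the other folders listed above follows the same pattern.

@echo off
rem Sketch: collect a user's Recent and Temp folders, choosing source paths by Windows version
set OUTDIR=%COMPUTERNAME%
ver | findstr /C:"Version 5." > nul
if %errorlevel%==0 (
    rem Windows 2000/XP/2003
    set "RECENT=%systemdrive%\Documents and Settings\%USERNAME%\Recent"
    set "TEMPDIR=%systemdrive%\Documents and Settings\%USERNAME%\Local Settings\Temp"
) else (
    rem Windows Vista and newer
    set "RECENT=%systemdrive%\Users\%USERNAME%\AppData\Roaming\Microsoft\Windows\Recent"
    set "TEMPDIR=%systemdrive%\Users\%USERNAME%\AppData\Local\Temp"
)
robocopy.exe "%RECENT%" "%OUTDIR%\Recent" /ZB /copy:DAT /r:0 /ts /FP /np /E /log:"%OUTDIR%\Recent_log.txt"
robocopy.exe "%TEMPDIR%" "%OUTDIR%\TEMP" /ZB /copy:DAT /r:0 /ts /FP /np /E /log:"%OUTDIR%\TEMP_log.txt"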
Remote live response

However, as mentioned earlier, it is often necessary to collect this information remotely. On Windows systems, this is usually done with the Sysinternals PsExec utility. PsExec lets you execute commands on remote computers and does not require anything to be installed manually on the target system. Here is how the program works: the psexec.exe executable carries another executable, PsExecSvc, embedded as a resource; this file runs as a Windows service on the target machine. Before executing the command, PsExec unpacks this hidden resource to the administrative share Admin$ (C:\Windows) of the remote computer, as the file Admin$\system32\psexecsvc.exe. After copying it, PsExec installs and runs the service using the Windows service management API functions. Then, once psexecsvc has started, a data connection (for sending input commands and receiving results) is established between psexecsvc and psexec. Upon completion of the work, psexec stops the service and removes it from the target computer.

If you need to collect information remotely from a workstation running a UNIX-like OS, you can use the Winexe utility. Winexe is a GNU/Linux-based application that allows users to execute commands remotely on Windows NT/2000/XP/2003/Vista/7/8 systems. It installs a service on the remote system, executes the command, and uninstalls the service. Winexe allows execution of most Windows shell commands:

winexe -U [Domain/]User%Password //host command

To launch a Windows shell from inside your Linux system, use the following command:

winexe -U HOME/Administrator%Pass123 //192.168.0.1 "cmd.exe"

Summary

In this article, we discussed what we should have in the jump bag to handle a computer incident and what kind of skills the members of the IR team require. We also took a look at live response, collecting volatile and nonvolatile information from a live system with different tools, and discussed when to use a live response approach as an alternative to traditional forensics.
Resources for Article: Further resources on this subject: BackTrack Forensics [article] Mobile Phone Forensics – A First Step into Android Forensics [article] Forensics Recovery [article]
Containerizing a Web Application with Docker Part 1

Darwin Corn
10 Jun 2016
4 min read
Congratulations, you’ve written a web application! Now what? Part one of this post deals with the steps to take after development, more specifically the creation of a Docker image that contains the application. In part two, I’ll lay out deploying that image to the Google Cloud Platform as well as some further reading that'll help you descend into the rabbit hole that is DevOps. For demonstration purposes, let’s say that you’re me and you want to share your adventures in TrapRap and Death Metal (not simultaneously, thankfully!) with the world. I’ve written a simple Ember frontend for this purpose, and through the course of this post, I will explain to you how I go about containerizing it. Of course, the beauty of this procedure is that it will work with any frontend application, and you are certainly welcome to Bring Your Own Code. Everything I use is publicly available on GitHub, however, and you’re certainly welcome to work through this post with the material presented as well. So, I’ve got this web app. You can get it here, or you can run:

$ git clone https://github.com/ndarwincorn/docker-demo.git

Do this for wherever it is you keep your source code. You’ll need ember-cli and some familiarity with Ember to customize it yourself, or you can just cut to the chase and build the Docker image, which is what I’m going to do in this post. I’m using Docker 1.10, but there’s no reason this wouldn’t work on a Mac running Docker Toolbox (or even Boot2Docker, but don’t quote me on that) or a less bleeding-edge Linux distro. Since installing Docker is well documented, I won’t get into that here and will continue with the assumption that you have a working, up-to-date Docker installed on your machine, and that the Docker daemon is running. If you’re working with your own app, feel free to skip below to my explanation of the process and then come back here when you’ve got a Dockerfile in the root of your application. In the root of the application, run the following (make sure you don’t have any locally-installed web servers listening on port 80 already):

# docker build -t docker-demo .
# docker run -d -p 80:80 --name demo docker-demo

Once the command finishes by printing a container ID, launch a web browser and navigate to http://localhost. Hey! Now you can listen to my music served from a Linux container running on your very own computer. How did we accomplish this? Let’s take it piece-by-piece (here’s where to start reading again if you’ve approached this article with your own app): I created a simple Dockerfile using the official Nginx image because I have a deep-seated mistrust of Canonical and don’t want to use the Dockerfile here. Here’s what it looks like in my project:

docker-demo/Dockerfile

FROM nginx
COPY dist /usr/share/nginx/html

Running the docker build command reads the Dockerfile and uses it to configure a Docker image based on the nginx image. During image configuration, it copies the contents of the dist folder in my project to /usr/share/nginx/html in the container, which is the directory that the image's default nginx configuration serves. The -t flag tells Docker to ‘tag’ (name) the image we’ve just created as ‘docker-demo’. The docker run command takes that image and builds a container from it. The -d flag is short for ‘detach’: run the nginx command built into the image from our Dockerfile and leave the container running in the background. The -p flag maps a port on the host to a port in the container, and --name names the container for later reference.
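Before moving on, it can be worth confirming what those two commands produced. Nothing below is specific to this project; these are the standard Docker CLI calls for inspecting and cleaning up, using the image and container names chosen above:

# docker images docker-demo
# docker ps --filter name=demo
# docker logs demo
# docker stop demo && docker rm demo

The first two confirm that the image exists and that the container is running, docker logs shows nginx’s output for the requests you make to http://localhost, and the last line tears the container down once you’re done experimenting.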
The docker run command should return a container ID that can be used to manipulate the container later. In part two, I’ll show you how to push the image we created to the Google Cloud Platform and then launch it as a container in a specially-purposed VM on their Compute Engine. About the Author Darwin Corn is a Systems Analyst for the Consumer Direct Care Network. He is a mid-level professional with diverse experience in the Information Technology world.
Setting up Spark

Packt
10 Jun 2016
14 min read
In this article by Alexander Kozlov, author of the book Mastering Scala Machine Learning, we will discuss how to download the pre-build Spark package from http://spark.apache.org/downloads.html,if you haven't done so yet. The latest release of  Spark, at the time of writing, is 1.6.1: Figure 3-1: The download site at http://spark.apache.org with recommended selections for this article (For more resources related to this topic, see here.) Alternatively, you can build the Spark by downloading the full source distribution from https://github.com/apache/spark: $ git clone https://github.com/apache/spark.git Cloning into 'spark'... remote: Counting objects: 301864, done. ... $ cd spark $sh ./ dev/change-scala-version.sh 2.11 ... $./make-distribution.sh --name alex-build-2.6-yarn --skip-java-test --tgz -Pyarn -Phive -Phive-thriftserver -Pscala-2.11 -Phadoop-2.6 ... The command will download the necessary dependencies and create the spark-2.0.0-SNAPSHOT-bin-alex-spark-build-2.6-yarn.tgz file in the Spark directory; the version is 2.0.0, as it is the next release version at the time of writing. In general, you do not want to build from trunk unless you are interested in the latest features. If you want a released version, you can visit the corresponding tag. Full list of available versions is available via the git branch –r command. The spark*.tgz file is all you need to run Spark on any machine that has Java JRE. The distribution comes with the docs/building-spark.md document that describes other options for building Spark and their descriptions, including incremental Scala compiler zinc. Full Scala 2.11 support is in the works for the next Spark 2.0.0 release. Applications Let's consider a few practical examples and libraries in Spark/Scala starting with a very traditional problem of word counting. Word count Most modern machine learning algorithms require multiple passes over data. If the data fits in the memory of a single machine, the data is readily available and this does not present a performance bottleneck. However, if the data becomes too large to fit into RAM, one has a choice of either dumping pieces of the data on disk (or database), which is about 100 times slower, but has a much larger capacity, or splitting the dataset between multiple machines across the network and transferring the results. While there are still ongoing debates, for most practical systems, analysis shows that storing the data over a set of network connected nodes has a slight advantage over repeatedly storing and reading it from hard disks on a single node, particularly if we can split the workload effectively between multiple CPUs. An average disk has bandwidth of about 100 MB/sec and transfers with a few mms latency, depending on the rotation speed and caching. This is about 100 times slower than reading the data from memory, depending on the data size and caching implementation again. Modern data bus can transfer data at over 10 GB/sec. While the network speed still lags behind the direct memory access, particularly with standard TCP/IP kernel networking layer overhead, specialized hardware can reach tens of GB/sec and if run in parallel, it can be potentially as fast as reading from the memory. In practice, the network-transfer speeds are somewhere between 1 to 10 GB/sec, but still faster than the disk in most practical systems. Thus, we can potentially fit the data into combined memory of all the cluster nodes and perform iterative machine learning algorithms across a system of them. 
One problem with memory, however, is that it is does not persist across node failures and reboots. A popular big data framework, Hadoop, made possible with the help of the original Dean/Ghemawat paper (Jeff Dean and Sanjay Ghemawat, MapReduce: Simplified Data Processing on Large Clusters, OSDI, 2004.), is using exactly the disk layer persistence to guarantee fault tolerance and store intermediate results. A Hadoop MapReduce program would first run a map function on each row of a dataset, emitting one or more key/value pairs. These key/value pairs then would be sorted, grouped, and aggregated by key so that the records with the same key would end up being processed together on the same reducer, which might be running on same or another node. The reducer applies a reduce function that traverses all the values that were emitted for the same key and aggregates them accordingly. The persistence of intermediate results would guarantee that if a reducer fails for one or another reason, the partial computations can be discarded and the reduce computation can be restarted from the checkpoint-saved results. Many simple ETL-like applications traverse the dataset only once with very little information preserved as state from one record to another. For example, one of the traditional applications of MapReduce is word count. The program needs to count the number of occurrences of each word in a document consisting of lines of text. In Scala, the word count is readily expressed as an application of the foldLeft method on a sorted list of words: val lines = scala.io.Source.fromFile("...").getLines.toSeq val counts = lines.flatMap(line => line.split("\W+")).sorted.   foldLeft(List[(String,Int)]()){ (r,c) =>     r match {       case (key, count) :: tail =>         if (key == c) (c, count+1) :: tail         else (c, 1) :: r         case Nil =>           List((c, 1))   } } If I run this program, the output will be a list of (word, count) tuples. The program splits the lines into words, sorts the words, and then matches each word with the latest entry in the list of (word, count) tuples. The same computation in MapReduce would be expressed as follows: val linesRdd = sc.textFile("hdfs://...") val counts = linesRdd.flatMap(line => line.split("\W+"))     .map(_.toLowerCase)     .map(word => (word, 1)).     .reduceByKey(_+_) counts.collect First, we need to process each line of the text by splitting the line into words and generation (word, 1) pairs. This task is easily parallelized. Then, to parallelize the global count, we need to split the counting part by assigning a task to do the count for a subset of words. In Hadoop, we compute the hash of the word and divide the work based on the value of the hash. Once the map task finds all the entries for a given hash, it can send the key/value pairs to the reducer, the sending part is usually called shuffle in MapReduce vernacular. A reducer waits until it receives all the key/value pairs from all the mappers, combines the values—a partial combine can also happen on the mapper, if possible—and computes the overall aggregate, which in this case is just sum. A single reducer will see all the values for a given word. 
Let's look at the log output of the word count operation in Spark (Spark is very verbose by default, you can manage the verbosity level by modifying the conf/log4j.properties file by replacing INFO with ERROR or FATAL): $ wget http://mirrors.sonic.net/apache/spark/spark-1.6.1/spark-1.6.1-bin-hadoop2.6.tgz $ tar xvf spark-1.6.1-bin-hadoop2.6.tgz $ cd spark-1.6.1-bin-hadoop2.6 $ mkdir leotolstoy $ (cd leotolstoy; wget http://www.gutenberg.org/files/1399/1399-0.txt) $ bin/spark-shell Welcome to       ____              __      / __/__  ___ _____/ /__     _ / _ / _ `/ __/  '_/    /___/ .__/_,_/_/ /_/_   version 1.6.1       /_/   Using Scala version 2.11.7 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_40) Type in expressions to have them evaluated. Type :help for more information. Spark context available as sc. SQL context available as sqlContext. scala> val linesRdd = sc.textFile("leotolstoy", minPartitions=10) linesRdd: org.apache.spark.rdd.RDD[String] = leotolstoy MapPartitionsRDD[3] at textFile at <console>:27 At this stage, the only thing that happened is metadata manipulations, Spark has not touched the data itself. Spark estimates that the size of the dataset and the number of partitions. By default, this is the number of HDFS blocks, but we can specify the minimum number of partitions explicitly with the minPartitions parameter: scala> val countsRdd = linesRdd.flatMap(line => line.split("\W+")).      | map(_.toLowerCase).      | map(word => (word, 1)).      | reduceByKey(_+_) countsRdd: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[5] at reduceByKey at <console>:31 We just defined another RDD derived from the original linesRdd: scala> countsRdd.collect.filter(_._2 > 99) res3: Array[(String, Int)] = Array((been,1061), (them,841), (found,141), (my,794), (often,105), (table,185), (this,1410), (here,364), (asked,320), (standing,132), ("",13514), (we,592), (myself,140), (is,1454), (carriage,181), (got,277), (won,153), (girl,117), (she,4403), (moment,201), (down,467), (me,1134), (even,355), (come,667), (new,319), (now,872), (upon,207), (sister,115), (veslovsky,110), (letter,125), (women,134), (between,138), (will,461), (almost,124), (thinking,159), (have,1277), (answer,146), (better,231), (men,199), (after,501), (only,654), (suddenly,173), (since,124), (own,359), (best,101), (their,703), (get,304), (end,110), (most,249), (but,3167), (was,5309), (do,846), (keep,107), (having,153), (betsy,111), (had,3857), (before,508), (saw,421), (once,334), (side,163), (ough... Word count over 2 GB of text data—40,291 lines and 353,087 words—took under a second to read, split, and group by words. With extended logging, you could see the following: Spark opens a few ports to communicate with the executors and users Spark UI runs on port 4040 on http://localhost:4040 You can read the file either from local or distributed storage (HDFS, Cassandra, and S3) Spark will connect to Hive if Spark is built with Hive support Spark uses lazy evaluation and executes the pipeline only when necessary or when output is required Spark uses internal scheduler to split the job into tasks, optimize the execution, and execute the tasks The results are stored into RDDs, which can either be saved or brought into RAM of the node executing the shell with collect method The art of parallel performance tuning is to split the workload between different nodes or threads so that the overhead is relatively small and the workload is balanced. 
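The text above stops at the principle; if you want to see how evenly your own file was split, one quick and purely illustrative check in the same spark-shell session is to count the records that landed in each partition of linesRdd before the shuffle:

// How many records ended up in each partition of linesRdd?
// A strongly skewed distribution means the work will not be balanced across executors.
val partitionSizes = linesRdd
  .mapPartitionsWithIndex((idx, it) => Iterator((idx, it.size)))
  .collect()
partitionSizes.foreach { case (idx, n) => println(s"partition $idx: $n lines") }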
Streaming word count Spark supports listening on incoming streams, partitioning it, and computing aggregates close to real-time. Currently supported sources are Kafka, Flume, HDFS/S3, Kinesis, Twitter, as well as the traditional MQs such as ZeroMQ and MQTT. In Spark, streaming is implemented as micro-batches. Internally, Spark divides input data into micro-batches, usually from subseconds to minutes in size and performs RDD aggregation operations on these micro-batches. For example, let's extend the Flume example that we covered earlier. We'll need to modify the Flume configuration file to create a Spark polling sink. Instead of HDFS, replace the sink section: # The sink is Spark a1.sinks.k1.type=org.apache.spark.streaming.flume.sink.SparkSink a1.sinks.k1.hostname=localhost a1.sinks.k1.port=4989 Now, instead of writing to HDFS, Flume will wait for Spark to poll for data: object FlumeWordCount {   def main(args: Array[String]) {     // Create the context with a 2 second batch size     val sparkConf = new SparkConf().setMaster("local[2]")       .setAppName("FlumeWordCount")     val ssc = new StreamingContext(sparkConf, Seconds(2))     ssc.checkpoint("/tmp/flume_check")     val hostPort=args(0).split(":")     System.out.println("Opening a sink at host: [" + hostPort(0) +       "] port: [" + hostPort(1).toInt + "]")     val lines = FlumeUtils.createPollingStream(ssc, hostPort(0),       hostPort(1).toInt, StorageLevel.MEMORY_ONLY)     val words = lines       .map(e => new String(e.event.getBody.array)).         map(_.toLowerCase).flatMap(_.split("\W+"))       .map(word => (word, 1L))       .reduceByKeyAndWindow(_+_, _-_, Seconds(6),         Seconds(2)).print     ssc.start()     ssc.awaitTermination()   } } To run the program, start the Flume agent in one window: $ ./bin/flume-ng agent -Dflume.log.level=DEBUG,console -n a1 –f ../chapter03/conf/flume-spark.conf ... Then run the FlumeWordCount object in another: $ cd ../chapter03 $ sbt "run-main org.akozlov.chapter03.FlumeWordCount localhost:4989 ... Now, any text typed to the netcat connection will be split into words and counted every two seconds for a six second sliding window: $ echo "Happy families are all alike; every unhappy family is unhappy in its own way" | nc localhost 4987 ... ------------------------------------------- Time: 1464161488000 ms ------------------------------------------- (are,1) (is,1) (its,1) (family,1) (families,1) (alike,1) (own,1) (happy,1) (unhappy,2) (every,1) ...   ------------------------------------------- Time: 1464161490000 ms ------------------------------------------- (are,1) (is,1) (its,1) (family,1) (families,1) (alike,1) (own,1) (happy,1) (unhappy,2) (every,1) ... Spark/Scala allows to seamlessly switch between the streaming sources. 
For example, the same program for Kafka publish/subscribe topic model looks similar to the following: object KafkaWordCount {   def main(args: Array[String]) {     // Create the context with a 2 second batch size     val sparkConf = new SparkConf().setMaster("local[2]")       .setAppName("KafkaWordCount")     val ssc = new StreamingContext(sparkConf, Seconds(2))     ssc.checkpoint("/tmp/kafka_check")     System.out.println("Opening a Kafka consumer at zk:       [" + args(0) + "] for group group-1 and topic example")     val lines = KafkaUtils.createStream(ssc, args(0), "group-1",       Map("example" -> 1), StorageLevel.MEMORY_ONLY)     val words = lines       .flatMap(_._2.toLowerCase.split("\W+"))       .map(word => (word, 1L))       .reduceByKeyAndWindow(_+_, _-_, Seconds(6),         Seconds(2)).print     ssc.start()     ssc.awaitTermination()   } } To start the Kafka broker, first download the latest binary distribution and start ZooKeeper. ZooKeeper is a distributed-services coordinator and is required by Kafka even in a single-node deployment: $ wget http://apache.cs.utah.edu/kafka/0.9.0.1/kafka_2.11-0.9.0.1.tgz ... $ tar xf kafka_2.11-0.9.0.1.tgz $ bin/zookeeper-server-start.sh config/zookeeper.properties ... In another window, start the Kafka server: $ bin/kafka-server-start.sh config/server.properties ... Run the KafkaWordCount object: $ $ sbt "run-main org.akozlov.chapter03.KafkaWordCount localhost:2181" ... Now, publishing the stream of words into the Kafka topic will produce the window counts: $ echo "Happy families are all alike; every unhappy family is unhappy in its own way" | ./bin/kafka-console-producer.sh --broker-list localhost:9092 --topic example ... $ sbt "run-main org.akozlov.chapter03.FlumeWordCount localhost:4989 ... ------------------------------------------- Time: 1464162712000 ms ------------------------------------------- (are,1) (is,1) (its,1) (family,1) (families,1) (alike,1) (own,1) (happy,1) (unhappy,2) (every,1) As you see, the programs output every two seconds. Spark streaming is sometimes called micro-batch processing. Streaming has many other applications (and frameworks), but this is too big of a topic to be entirely considered here and needs to be covered separately. Spark SQL and DataFrame DataFrame was a relatively recent addition to Spark, introduced in version 1.3, allowing one to use the standard SQL language for data analysis. SQL is really great for simple exploratory analysis and data aggregations. According to the latest poll results, about 70% of Spark users use DataFrame. Although DataFrame recently became the most popular framework for working with tabular data, it is relatively a heavyweight object. The pipelines that use DataFrames may execute much slower than the ones that are based on Scala's vector or LabeledPoint, which will be discussed in the next chapter. The evidence from different developers is that the response times can be driven to tens or hundreds of milliseconds, depending on the query from submillisecond on simpler objects. 
Spark implements its own shell for SQL, which can be invoked additionally to the standard Scala REPL shell: ./bin/spark-sql can be used to access the existing Hive/Impala or relational DB tables: $ ./bin/spark-sql … spark-sql> select min(duration), max(duration), avg(duration) from kddcup; … 0  58329  48.34243046395876 Time taken: 11.073 seconds, Fetched 1 row(s) In standard Spark's REPL, the same query can performed by running the following command: $ ./bin/spark-shell … scala> val df = sqlContext.sql("select min(duration), max(duration), avg(duration) from kddcup" 16/05/12 13:35:34 INFO parse.ParseDriver: Parsing command: select min(duration), max(duration), avg(duration) from alex.kddcup_parquet 16/05/12 13:35:34 INFO parse.ParseDriver: Parse Completed df: org.apache.spark.sql.DataFrame = [_c0: bigint, _c1: bigint, _c2: double] scala> df.collect.foreach(println) … 16/05/12 13:36:32 INFO scheduler.DAGScheduler: Job 2 finished: collect at <console>:22, took 4.593210 s [0,58329,48.34243046395876] Summary In this article, we discussed Spark and Hadoop and their relationship with Scala. We also discussed functional programming at a very high level. We then considered a classic word count example and it's implementation in Scala and Spark. Resources for Article: Further resources on this subject: Holistic View on Spark [article] Exploring Scala Performance [article] Getting Started with Apache Hadoop and Apache Spark [article]
Clustering Methods

Packt
09 Jun 2016
18 min read
In this article by Magnus Vilhelm Persson, author of the book Mastering Python Data Analysis, we will see that with data comprising of several separated distributions, how do we find and characterize them? In this article, we will look at some ways to identify clusters in data. Groups of points with similar characteristics form clusters. There are many different algorithms and methods to achieve this, with good and bad points. We want to detect multiple separate distributions in the data; for each point, we determine the degree of association (or similarity) with another point or cluster. The degree of association needs to be high if they belong in a cluster together or low if they do not. This can, of course, just as previously, be a one-dimensional problem or a multidimensional problem. One of the inherent difficulties of cluster finding is determining how many clusters there are in the data. Various approaches to define this exist—some where the user needs to input the number of clusters and then the algorithm finds which points belong to which cluster, and some where the starting assumption is that every point is a cluster and then two nearby clusters are combined iteratively on trial basis to see if they belong together. In this article, we will cover the following topics: A short introduction to cluster finding, reminding you of the general problem and an algorithm to solve it Analysis of a dataset in the context of cluster finding, the Cholera outbreak in central London 1854 By Simple zeroth order analysis, calculating the centroid of the whole dataset By finding the closest water pump for each recorded Cholera-related death Applying the K-means nearest neighbor algorithm for cluster finding to the data and identifying two separate distributions (For more resources related to this topic, see here.) The algorithms and methods covered here are focused on those available in SciPy. Start a new Notebook, and put in the default imports. Perhaps you want to change to interactive Notebook plotting to try it out a bit more. For this article, we are adding the following specific imports. The ones related to clustering are from SciPy, while later on we will need some packages to transform astronomical coordinates. These packages are all preinstalled in the Anaconda Python 3 distribution and have been tested there: import scipy.cluster.hierarchy as hac import scipy.cluster.vq as vq Introduction to cluster finding There are many different algorithms for cluster identification. Many of them try to solve a specific problem in the best way. Therefore, the specific algorithm that you want to use might depend on the problem you are trying to solve and also on what algorithms are available in the specific package that you are using. Some of the first clustering algorithms consisted of simply finding the centroid positions that minimize the distances to all the points in each cluster. The points in each cluster are closer to that centroid than other cluster centroids. As might be obvious at this point, the hardest part with this is figuring out how many clusters there are. If we can determine this, it is fairly straightforward to try various ways of moving the cluster centroid around, calculate the distance to each point, and then figure out where the cluster centroids are. There are also obvious situations where this might not be the best solution, for example, if you have two very elongated clusters next to each other. 
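Before we get to real data, it may help to see the whole workflow on a toy example. The following sketch is not part of the original analysis: it fabricates two Gaussian blobs with NumPy and then recovers their centers with the kmeans2 function from the vq module imported above.

import numpy as np

# Two synthetic 2D clusters, centered roughly at (0, 0) and (5, 5)
np.random.seed(42)
points = np.vstack([np.random.randn(100, 2),
                    np.random.randn(100, 2) + 5.0])

# kmeans2 works best on whitened (std-normalized) data; keep the scale to undo it later
scale = points.std(axis=0)
whitened = vq.whiten(points)

centroids, labels = vq.kmeans2(whitened, 2)   # ask for two clusters
print(centroids * scale)      # centroid positions back in the original units
print(np.bincount(labels))    # roughly 100 points assigned to each cluster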
Commonly, the distance is the Euclidean distance: Here, p is a vector with all the points' positions,that is,{p1,p2,...,pN–1,pN} in cluster Ck, that is P E Ck , the distances are calculated from the cluster centroid,Ui . We have to find the cluster centroids that minimize the sum of the absolute distances to the points: In this first example, we shall first work with fixed cluster centroids. Starting out simple – John Snow on Cholera In 1854, there was an outbreak of cholera in North-western London, in the neighborhood around Broad Street. The leading theories at the time claimed that cholera spread, just like it was believed that the plague spread: through foul, bad air. John Snow, a physician at the time, hypothesized that cholera spread through drinking water. During the outbreak, John tracked the deaths and drew them on a map of the area. Through his analysis, he concluded that most of the cases were centered on the Broad Street water pump. Rumors say that he then removed the handle of the water pump, thus stopping an epidemic. Today, we know that cholera is usually transmitted through contaminated food or water, thus confirming John's hypothesis. We will do a short but instructive reanalysis of John Snow's data. The data comes from the public data archives of The National Center for Geographic Information and Analysis (http://www.ncgia.ucsb.edu/ and http://www.ncgia.ucsb.edu/pubs/data.php). A cleaned up map and copy of the data files along with an example of Geospatial information analysis of the data can also be found at https://www.udel.edu/johnmack/frec682/cholera/cholera2.html.A wealth of information about physician and scientist John Snow's life and works can be found at http://johnsnow.matrix.msu.edu. To start the analysis, we read the data into a Pandas DataFrame; the data is already formatted in CSV-files readable by Pandas: deaths = pd.read_csv('data/cholera_deaths.txt') pumps = pd.read_csv('data/cholera_pumps.txt') Each file contains two columns, one for X coordinates and one for Y coordinates. Let's check what it looks like: deaths.head() pumps.head() With this information, we can now plot all the pumps and deaths to visualize the data: plt.figure(figsize=(4,3.5)) plt.plot(deaths['X'], deaths['Y'], marker='o', lw=0, mew=1, mec='0.9', ms=6) plt.plot(pumps['X'],pumps['Y'], marker='s', lw=0, mew=1, mec='0.9', color='k', ms=6) plt.axis('equal') plt.xlim((4.0,22.0)); plt.xlabel('X-coordinate') plt.ylabel('Y-coordinate') plt.title('John Snow's Cholera') It is fairly easy to see that the pump in the middle is important. As a first data exploration, we will simply calculate the mean centroid of the distribution and plot that in the figure as an ellipse. 
We will calculate the mean and standard deviation along the x and y axis as the centroid position: fig = plt.figure(figsize=(4,3.5)) ax = fig.add_subplot(111) plt.plot(deaths['X'], deaths['Y'], marker='o', lw=0, mew=1, mec='0.9', ms=6) plt.plot(pumps['X'],pumps['Y'], marker='s', lw=0, mew=1, mec='0.9', color='k', ms=6) from matplotlib.patches import Ellipse ellipse = Ellipse(xy=(deaths['X'].mean(), deaths['Y'].mean()), width=deaths['X'].std(), height=deaths['Y'].std(), zorder=32, fc='None', ec='IndianRed', lw=2) ax.add_artist(ellipse) plt.plot(deaths['X'].mean(), deaths['Y'].mean(), '.', ms=10, mec='IndianRed', zorder=32) for i in pumps.index: plt.annotate(s='{0}'.format(i), xy=(pumps[['X','Y']].loc[i]), xytext=(-15,6), textcoords='offset points') plt.axis('equal') plt.xlim((4.0,22.5)) plt.xlabel('X-coordinate') plt.ylabel('Y-coordinate') plt.title('John Snow's Cholera') Here, we also plotted the pump index, which we can get from DataFrame with the pumps.index method. The next step in the analysis is to see which pump is the closest to each point. We do this by calculating the distance from all pumps to all points. Then we want to figure out which pump is the closest for each point. We save the closest pump to each point in a separate column of the deaths' DataFrame. With this dataset, the for-loop runs fairly quickly. However, the DataFrame subtract method chained with sum() and idxmin() methods takes a few seconds. I strongly encourage you to play around with various ways to speed this up. We also use the .apply() method of DataFrame to square and square root the values. The simple brute force first attempt of this took over a minute to run. The built-in functions and methods help a lot: deaths_tmp = deaths[['X','Y']].as_matrix() idx_arr = np.array([], dtype='int') for i in range(len(deaths)): idx_arr = np.append(idx_arr, (pumps.subtract(deaths_tmp[i])).apply(lambda x:x**2).sum(axis=1).apply(lambda x:x**0.5).idxmin()) deaths['C'] = idx_arr Quickly check whether everything seems fine by printing out the top rows of the table: deaths.head() Now we want to visualize what we have. With colors, we can show which water pump we associate each death with. To do this, we use a colormap, in this case, the jet colormap. By calling the colormap with a value between 0 and 1, it returns a color; thus we give it the pump indexes and then divide it with the total number of pumps, 12 in our case: fig = plt.figure(figsize=(4,3.5)) ax = fig.add_subplot(111) np.unique(deaths['C'].values) plt.scatter(deaths['X'].as_matrix(), deaths['Y'].as_matrix(), color=plt.cm.jet(deaths['C']/12.), marker='o', lw=0.5, edgecolors='0.5', s=20) plt.plot(pumps['X'],pumps['Y'], marker='s', lw=0, mew=1, mec='0.9', color='0.3', ms=6) for i in pumps.index: plt.annotate(s='{0}'.format(i), xy=(pumps[['X','Y']].loc[i]), xytext=(-15,6), textcoords='offset points', ha='right') ellipse = Ellipse(xy=(deaths['X'].mean(), deaths['Y'].mean()), width=deaths['X'].std(), height=deaths['Y'].std(), zorder=32, fc='None', ec='IndianRed', lw=2) ax.add_artist(ellipse) plt.axis('equal') plt.xlim((4.0,22.5)) plt.xlabel('X-coordinate') plt.ylabel('Y-coordinate') plt.title('John Snow's Cholera') The majority of deaths are dominated by the proximity of the pump in the center. This pump is located on Broad Street. Now, remember that we have used fixed positions for the cluster centroids. In this case, we are basically working on the assumption that the water pumps are related to the cholera cases. 
Furthermore, the Euclidean distance is not really the real-life distance. People go along roads to get water and the road there is not necessarily straight. Thus, one would have to map out the streets and calculate the distance to each pump from that. Even so, already at this level, it is clear that there is something with the center pump related to the cholera cases. How would you account for the different distance? To calculate the distance, you would do what is called cost-analysis (c.f. when you hit directions on your sat-nav to go to a place). There are many different ways of doing cost analysis, and it also relates to the problem of finding the correct way through a maze. In addition to these things, we do not have any data in the time domain, that is, the cholera would possibly spread to other pumps with time and thus the outbreak might have started at the Broad Street pump and spread to other nearby pumps. Without time data, it is difficult to figure out what happened. This is the general approach to cluster finding. The coordinates might be attributes instead, length and weight of dogs for example, and the location of the cluster centroid something that we would iteratively move around until we find the best position. K-means clustering The K-means algorithm is also referred to as vector quantization. What the algorithm does is find the cluster (centroid) positions that minimize the distances to all points in the cluster. This is done iteratively; the problem with the algorithm is that it can be a bit greedy, meaning that it will find the nearest minima quickly. This is generally solved with some kind of basin-hopping approach where the nearest minima found is randomly perturbed and the algorithm restarted. Due to this fact, the algorithm is dependent on good initial guesses as input. Suicide rate versus GDP versus absolute lattitude We will analyze the data of suicide rates versus GDP versus absolute lattitude or Degrees From Equator (DFE) for clusters. Our hypothesis from the visual inspection was that there were at least two distinct clusters, one with higher suicide rate, GDP, and absolute lattitude and one with lower. Here, the Hierarchical Data Format (HDF) file is now read in as a DataFrame. This time, we want to discard all the rows where one or more column entries are NaN or empty. Thus, we use the appropriate DataFrame method for this: TABLE_FILE = 'data/data_ch4.h5' d2 = pd.read_hdf(TABLE_FILE) d2 = d2.dropna() Next, while the DataFrame is a very handy format, which we will utilize later on, the input to the cluster algorithms in SciPy do not handle Pandas data types natively. Thus, we transfer the data to a NumPy array: rates = d2[['DFE','GDP_CD','Both']].as_matrix().astype('float') Next, to recap, we visualise the data by one histogram of the GDP and one scatter plot for all the data. We do this to aid us in the initial guesses of the cluster centroid positions: plt.subplots(12, figsize=(8,3.5)) plt.subplot(121) plt.hist(rates.T[1], bins=20,color='SteelBlue') plt.xticks(rotation=45, ha='right') plt.yscale('log') plt.xlabel('GDP') plt.ylabel('Counts') plt.subplot(122) plt.scatter(rates.T[0], rates.T[2], s=2e5*rates.T[1]/rates.T[1].max(), color='SteelBlue', edgecolors='0.3'); plt.xlabel('Absolute Latitude (Degrees, 'DFE')') plt.ylabel('Suicide Rate (per 100')') plt.subplots_adjust(wspace=0.25); The scatter plots shows the GDP as size. The function to run the clustering k-means takes a special kind of normalized input. 
The data arrays (columns) have to be normalized by the standard deviation of the array. Although this is straightforward, there is a function included in the module called whiten. It will scale the data with the standard deviation: w = vq.whiten(rates) To show what it does to the data, we plot the same plots as we did previously, but with the output from the whiten function: plt.subplots(12, figsize=(8,3.5)) plt.subplot(121) plt.hist(w[:,1], bins=20, color='SteelBlue') plt.yscale('log') plt.subplot(122) plt.scatter(w.T[0], w.T[2], s=2e5*w.T[1]/w.T[1].max(), color='SteelBlue', edgecolors='0.3') plt.xticks(rotation=45, ha='right'); As you can see, all the data is scaled from the previous figure. However, as mentioned, the scaling is just the standard deviation. So let's calculate the scaling and save it to the sc variable: sc = rates.std(axis=0) Now we are ready to estimate the initial guesses for the cluster centroids. Reading off the first plot of the data, we guess the centroids to be at 20 DFE, 200,000 GDP, and 10 suicides and the second at 45 DFE, 100,000 GDP, and 15 suicides. We put this in an array and scale it with our scale parameter to the same scale as the output from the whiten function. This is then sent to the kmeans2 function of SciPy: init_guess = np.array([[20,20E3,10],[45,100E3,15]]) init_guess /= sc z2_cb, z2_lbl = vq.kmeans2(w, init_guess, minit='matrix', iter=500) There is another function, kmeans (without the 2), which is a less complex version and does not stop iterating when it reaches a local minima. It stops when the changes between two iterations go below some level. Thus, the standard K-means algorithm is represented in SciPy by the kmeans2 function. The function outputs the centroids' scaled positions (here z2_cb) and a lookup table (z2_lbl) telling us which row belongs to which centroid. To get the centroid positions in units we understand, we simply multiply with our scaling value: z2_cb_sc = z2_cb * sc At this point, we can plot the results. The following section is rather long and contains many different parts so we will go through them section by section. However, the code should be run in one cell of the Notebook: # K-means clustering figure START plt.figure(figsize=(6,4)) plt.scatter(z2_cb_sc[0,0], z2_cb_sc[0,2], s=5e2*z2_cb_sc[0,1]/rates.T[1].max(), marker='+', color='k', edgecolors='k', lw=2, zorder=10, alpha=0.7); plt.scatter(z2_cb_sc[1,0], z2_cb_sc[1,2], s=5e2*z2_cb_sc[1,1]/rates.T[1].max(), marker='+', color='k', edgecolors='k', lw=3, zorder=10, alpha=0.7); The first steps are quite simple—we set up the figure size and plot the points of the cluster centroids. We hypothesized about two clusters; thus, we plot them with two different calls to plt.scatter. Here, z2_cb_sc[1,0] gets the second cluster x-coordinate (DFE); then switching 0 for 1 gives us the y coordinate (rate). We set the size of the marker s-parameter to scale with the value of the third data axis, the GDP. We also do this further down for the data, just as in previous plots, so that it is easier to compare and differentiate the clusters. The zorder keyword gives the order in depth of the elements that are plotted; a high zorder will put them on top of everything else and a negative zorder will send them to the back. 
s0 = abs(z2_lbl==0).astype('bool') s1 = abs(z2_lbl==1).astype('bool') pattern1 = 5*'x' pattern2 = 4*'/' plt.scatter(w.T[0][s0]*sc[0], w.T[2][s0]*sc[2], s=5e2*rates.T[1][s0]/rates.T[1].max(), lw=1, hatch=pattern1, edgecolors='0.3', color=plt.cm.Blues_r( rates.T[1][s0]/rates.T[1].max())); plt.scatter(rates.T[0][s1], rates.T[2][s1], s=5e2*rates.T[1][s1]/rates.T[1].max(), lw=1, hatch=pattern2, edgecolors='0.4', marker='s', color=plt.cm.Reds_r( rates.T[1][s1]/rates.T[1].max()+0.4)) In this section, we plot the points of the clusters. First, we get the selection (Boolean) arrays. They are simply found by setting all indexes that refer to cluster 0 to True and all else to False; this gives us the Boolean array for cluster 0 (the first cluster). The second Boolean array is matched for the second cluster (cluster 1). Next, we define the hatch pattern for the scatter plot markers, which we later give as input to the plotting function. The multiplier for the hatch pattern gives the density of the pattern. The scatter plots for the points are created in a similar fashion as the centroids, except that the markers are a bit more complex. They are both color-coded, like in the previous example with Cholera deaths, but in a gradient instead of the exact same colors for all points. The gradient is defined by the GDP, which also defines the size of the points. The x and y data sent to the plot is different between the clusters, but they access the same data in the end because we multiply with our scaling factor. p1 = plt.scatter([],[], hatch='None', s=20E3*5e2/rates.T[1].max(), color='k', edgecolors='None',) p2 = plt.scatter([],[], hatch='None', s=40E3*5e2/rates.T[1].max(), color='k', edgecolors='None',) p3 = plt.scatter([],[], hatch='None', s=60E3*5e2/rates.T[1].max(), color='k', edgecolors='None',) p4 = plt.scatter([],[], hatch='None', s=80E3*5e2/rates.T[1].max(), color='k', edgecolors='None',) labels = ["20'", "40'", "60'", ">80'"] plt.legend([p1, p2, p3, p4], labels, ncol=1, frameon=True, #fontsize=12, handlelength=1, loc=1, borderpad=0.75,labelspacing=0.75, handletextpad=0.75, title='GDP', scatterpoints=1.5) plt.ylim((-4,40)) plt.xlim((-4,80)) plt.title('K-means clustering') plt.xlabel('Absolute Lattitude (Degrees, 'DFE')') plt.ylabel('Suicide Rate (per 100 000)'); The last tweak to the plot is made by creating a custom legend. We want to show different sizes of the points and what GDP they correspond to. As there is a continuous gradient from low to high, we cannot use the plotted points. Thus we create our own, but leave the x and y input coordinates as empty lists. This will not show anything in the plot but we can use them to register in the legend. The various tweaks to the legend function controls different aspects of the legend layout. I encourage you to experiment with it to see what happens: As for the final analysis, two different clusters are identified. Just as our previous hypothesis, there is a cluster with a clear linear trend with relatively higher GDP, which is also located at higher absolute latitude. Although the identification is rather weak, it is clear that the two groups are separated. Countries with low GDP are clustered closer to the equator. What happens when you add more clusters? Try to add a cluster for the low DFE, high rate countries, visualize it, and think about what this could mean for the conclusion(s). Summary In this article, we identified clusters using methods such as finding the centroid positions and K-means clustering. 
For more information about Python Data Analysis, refer to the following books by Packt Publishing: Python Data Analysis (https://www.packtpub.com/big-data-and-business-intelligence/python-data-analysis) Getting Started with Python Data Analysis (https://www.packtpub.com/big-data-and-business-intelligence/getting-started-python-data-analysis)   Resources for Article: Further resources on this subject: Python Data Science Up and Running [article] Basics of Jupyter Notebook and Python [article] Scientific Computing APIs for Python [article]
Angular 2 Components: What You Need to Know

David Meza
08 Jun 2016
10 min read
From the 7th to the 13th of November you can save up to 80% on some of our very best Angular content - along with our hottest React eBooks and video courses. If you're curious about the cutting-edge of modern web development we think you should click here and invest in your skills... The Angular team introduced quite a few changes in version 2 of the framework, and components are one of the important ones. If you are familiar with Angular 1 applications, components are actually a form of directives that are extended with template-oriented features. In addition, components are optimized for better performance and simpler configuration than a directive as Angular doesn’t support all its features. Also, while a component is technically a directive, it is so distinctive and central to Angular 2 applications that you’ll find that it is often separated as a different ingredient for the architecture of an application. So, what is a component? In simple words, a component is a building block of an application that controls a part of your screen real estate or your “view”. It does one thing, and it does it well. For example, you may have a component to display a list of active chats in a messaging app (which, in turn, may have child components to display the details of the chat or the actual conversation). Or you may have an input field that uses Angular’s two-way data binding to keep your markup in sync with your JavaScript code. Or, at the most elementary level, you may have a component that substitutes an HTML template with no special functionality just because you wanted to break down something complex into smaller, more manageable parts. Now, I don’t believe too much in learning something by only reading about it, so let’s get your hands dirty and write your own component to see some sample usage. I will assume that you already have Typescript installed and have done the initial configuration required for any Angular 2 app. If you haven’t, you can check out how to do so by clicking on this link. You may have already seen a component at its most basic level: import {Component} from 'angular2/core'; @Component({ selector: 'my-app', template: '<h1>{{ title }}</h1>' }) export class AppComponent { title = 'Hello World!'; } That’s it! That’s all you really need to have a component. Three things are happening here: You are importing the Component class from the Angular 2 core package. You are using a Typescript decorator to attach some metadata to your AppComponent class. If you don’t know what a decorator is, it is simply a function that extends your class with Angular code so that it becomes an Angular component. Otherwise, it would just be a plain class with no relation to the Angular framework. In the options, you defined a selector, which is the tag name used in the HTML code so that Angular can find where to insert your component, and a template, which is applied to the inner contents of the selector tag. You may notice that we also used interpolation to bind the component data and display the value of the public variable in the template. You are exporting your AppComponent class so that you can import it elsewhere (in this case, you would import it in your main script so that you can bootstrap your application). That’s a good start, but let’s get into a more complex example that showcases other powerful features of Angular and Typescript/ES2015. In the following example, I've decided to stuff everything into one component. 
However, if you'd like to stick to best practices and divide the code into different components and services or if you get lost at any point, you can check out the finished/refactored example here. Without any further ado, let’s make a quick page that displays a list of products. Let’s start with the index: <html> <head> <title>Products</title> <meta name="viewport" content="width=device-width, initial-scale=1"> <script src="node_modules/es6-shim/es6-shim.min.js"></script> <script src="node_modules/systemjs/dist/system-polyfills.js"></script> <script src="node_modules/angular2/bundles/angular2-polyfills.js"></script> <script src="node_modules/systemjs/dist/system.src.js"></script> <script src="node_modules/rxjs/bundles/Rx.js"></script> <script src="node_modules/angular2/bundles/angular2.dev.js"></script> <link rel="stylesheet" href="styles.css"> <script> System.config({ packages: { app: { format: 'register', defaultExtension: 'js' } } }); System.import('app/main') .then(null, console.error.bind(console)); </script> </head> <body> <my-app>Loading...</my-app> </body> </html> There’s nothing out of the ordinary going on here. You are just importing all of the necessary scripts for your application to work as demonstrated in the quick-start. The app/main.ts file should already look somewhat similar to this: import {bootstrap} from ‘angular2/platform/browser’ import {AppComponent} from ‘./app.component’ bootstrap(AppComponent); Here, we imported the bootstrap function from the Angular 2 package and an AppComponent class from the local directory. Then, we initialized the application. First, create a product class that defines the constructor and type definition of any products made. Then, create app/product.ts, as follows: export class Product { id: number; price: number; name: string; } Next, you will create an app.component.ts file, which is where the magic happens. I've decided to stuff everything in here for demonstration purposes, but ideally, you would want to extract the products array into its own service, the HTML template into its own file, and the product details into its own component. 
This is how the component will look: import {Component} from 'angular2/core'; import {Product} from './product' @Component({ selector: 'my-app', template: ` <h1>{{title}}</h1> <ul class="products"> <li *ngFor="#product of products" [class.selected]="product === selectedProduct" (click)="onSelect(product)"> <span class="badge">{{product.id}}</span> {{product.name}} </li> </ul> <div *ngIf="selectedProduct"> <h2>{{selectedProduct.name}} details!</h2> <div><label>id: </label>{{selectedProduct.id}}</div> <div><label>Price: </label>{{selectedProduct.price | currency: 'USD': true }}</div> <div> <label>name: </label> <input [(ngModel)]="selectedProduct.name" placeholder="name"/> </div> </div> `, styleUrls: ['app/app.component.css'] }) export class AppComponent { title = 'My Products'; products = PRODUCTS; selectedProduct: Product; onSelect(product: Product) { this.selectedProduct = product; } } const PRODUCTS: Product[] = [ { "id": 1, "price": 45.12, "name": "TV Stand" }, { "id": 2, "price": 25.12, "name": "BBQ Grill" }, { "id": 3, "price": 43.12, "name": "Magic Carpet" }, { "id": 4, "price": 12.12, "name": "Instant liquidifier" }, { "id": 5, "price": 9.12, "name": "Box of puppies" }, { "id": 6, "price": 7.34, "name": "Laptop Desk" }, { "id": 7, "price": 5.34, "name": "Water Heater" }, { "id": 8, "price": 4.34, "name": "Smart Microwave" }, { "id": 9, "price": 93.34, "name": "Circus Elephant" }, { "id": 10, "price": 87.34, "name": "Tinted Window" } ]; The app/app.component.css file will look something similar to this: .selected { background-color: #CFD8DC !important; color: white; } .products { margin: 0 0 2em 0; list-style-type: none; padding: 0; width: 15em; } .products li { position: relative; min-height: 2em; cursor: pointer; position: relative; left: 0; background-color: #EEE; margin: .5em; padding: .3em 0; border-radius: 4px; font-size: 16px; overflow: hidden; white-space: nowrap; text-overflow: ellipsis; color: #3F51B5; display: block; width: 100%; -webkit-transition: all 0.3s ease; -moz-transition: all 0.3s ease; -o-transition: all 0.3s ease; -ms-transition: all 0.3s ease; transition: all 0.3s ease; } .products li.selected:hover { background-color: #BBD8DC !important; color: white; } .products li:hover { color: #607D8B; background-color: #DDD; left: .1em; color: #3F51B5; text-decoration: none; font-size: 1.2em; background-color: rgba(0,0,0,0.01); } .products .text { position: relative; top: -3px; } .products .badge { display: inline-block; font-size: small; color: white; padding: 0.8em 0.7em 0 0.7em; background-color: #607D8B; line-height: 1em; position: relative; left: -1px; top: 0; height: 2em; margin-right: .8em; border-radius: 4px 0 0 4px; } I'll explain what is happening: We imported from Component so that we can decorate your new component and imported Product so that we can create an array of products and have access to Typescript type infererences. We decorated our component with a “my-app” selector property, which finds <my-app></my-app> tags and inserts our component there. I decided to define the template in this file instead of using a URL so that I can demonstrate how handy the ES2015 template string syntax is (no more long strings or plus-separated strings). Finally, the styleUrls property uses an absolute file path, and any styles applied will only affect the template in this scope. The actual component only has a few properties outside of the decorator configuration. 
It has a title that you can bind to the template, a products array that will iterate in the markup, a selectedProduct variable that is a scope variable that will initialize as undefined and an onSelect method that will be run every time you click on a list item. Finally, define a constant (const because I've hardcoded it in and it won't change in runtime) PRODUCTS array to mock an object that is usually returned by a service after an external request. Also worth noting are the following: As you are using Typescript, you can make inferences about what type of data your variables will hold. For example, you may have noticed that I defined the Product type whenever I knew that this the only kind of object I want to allow for this variable or to be passed to a function. Angular 2 has different property prefixes, and if you would like to learn when to use each one, you can check out this Stack Overflow question. That's it! You now have a bit more complex component that has a particular functionality. As I previously mentioned, this could be refactored, and that would look something similar to this: import {Component, OnInit} from 'angular2/core'; import {Product} from './product'; import {ProductDetailComponent} from './product-detail.component'; import {ProductService} from './product.service'; @Component({ selector: 'my-app', templateUrl: 'app/app.component.html', styleUrls: ['app/app.component.css'], directives: [ProductDetailComponent], providers: [ProductService] }) export class AppComponent implements OnInit { title = 'Products'; products: Product[]; selectedProduct: Product; constructor(private _productService: ProductService) { } getProducts() { this._productService.getProducts().then(products => this.products = products); } ngOnInit() { this.getProducts(); } onSelect(product: Product) { this.selectedProduct = product; } } In this example, you get your product data from a service and separate the product detail template into a child component, which is much more modular. I hope you've enjoyed reading this post. About this author David Meza is an AngularJS developer at the City of Raleigh. He is passionate about software engineering and learning new programming languages and frameworks. He is most familiar working with Ruby, Rails, and PostgreSQL in the backend and HTML5, CSS3, JavaScript, and AngularJS in the frontend. He can be found at here.