Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Events
Videos
Audiobooks
Packt Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds

How-To Tutorials

7018 Articles
article-image-cloudtrail-logs-amazon-elasticsearch
Vijin Boricha
10 May 2018
5 min read
Save for later

Analyzing CloudTrail Logs using Amazon Elasticsearch

Vijin Boricha
10 May 2018
5 min read
Log management and analysis for many organizations start and end with just three letters: E, L, and K, which stands for Elasticsearch, Logstash, and Kibana. In today's tutorial, we will learn about analyzing CloudTrail logs which are E, L and K. [box type="shadow" align="" class="" width=""]This tutorial is an excerpt from the book AWS Administration - The Definitive Guide - Second Edition written by Yohan Wadia. This book will help you enhance your application delivery skills with the latest AWS services while also securing and monitoring the environment workflow.[/box] The three open-sourced products are essentially used together to aggregate, parse, search and visualize logs at an enterprise scale: Logstash: Logstash is primarily used as a log collection tool. It is designed to collect, parse, and store logs originating from multiple sources, such as applications, infrastructure, operating systems, tools, services, and so on. Elasticsearch: With all the logs collected in one place, you now need a query engine to filter and search through these logs for particular events. That's exactly where Elasticsearch comes into play. Elasticsearch is basically a search server based on the popular information retrieval software library, Lucene. It provides a distributed, full-text search engine along with a RESTful web interface for querying your logs. Kibana: Kibana is an open source data visualization plugin, used in conjunction with Elasticsearch. It provides you with the ability to create and export your logs into various visual graphs, such as bar charts, scatter graphs, pie charts, and so on. You can easily download and install each of these components in your AWS environment, and get up and running with your very own ELK stack in a matter of hours! Alternatively, you can also leverage AWS own Elasticsearch service! Amazon Elasticsearch is a managed ELK service that enables you to quickly deploy operate, and scale an ELK stack as per your requirements. Using Amazon Elasticsearch, you eliminate the need for installing and managing the ELK stack's components on your own, which in the long run can be a painful experience. For this particular use case, we will leverage a simple CloudFormation template that will essentially set up an Amazon Elasticsearch domain to filter and visualize the captured CloudTrail Log files, as depicted in the following diagram: To get started, log in to the CloudFormation dashboard, at https://console.aws.amazon.com/cloudformation. Next, select the option Create Stack to bring up the CloudFormation template selector page. Paste http://s3.amazonaws.com/concurrencylabs-cfn-templates/cloudtrail-es-cluster/cloudtrail-es-cluster.json in, the Specify an Amazon S3 template URL field, and click on Next to continue. In the Specify Details page, provide a suitable Stack name and fill out the following required parameters: AllowedIPForEsCluster: Provide the IP address that will have access to the nginx proxy and, in turn, have access to your Elasticsearch cluster. In my case, I've provided my laptop's IP. Note that you can change this IP at a later stage, by visiting the security group of the nginx proxy once it has been created by the CloudFormation template. CloudTrailName: Name of the CloudTrail that we set up at the beginning of this chapter. KeyName: You can select a key-pair for obtaining SSH to your nginx proxy instance: LogGroupName: The name of the CloudWatch Log Group that will act as the input to our Elasticsearch cluster. ProxyInstanceTypeParameter: The EC2 instance type for your proxy instance. Since this is a demonstration, I've opted for the t2.micro instance type. Alternatively, you can select a different instance type as well. Once done, click on Next to continue. Review the settings of your stack and hit Create to complete the process. The stack takes a good few minutes to deploy as a new Elasticsearch domain is created. You can monitor the progress of the deployment by either viewing the CloudFormation's Output tab or, alternatively, by viewing the Elasticsearch dashboard. Note that, for this deployment, a default t2.micro.elasticsearch instance type is selected for deploying Elasticsearch. You should change this value to a larger instance type before deploying the stack for production use. You can view information on Elasticsearch Supported Instance Types at http://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/aes-supported-instance-types.html. With the stack deployed successfully, copy the Kibana URL from the CloudFormation Output tab: "KibanaProxyEndpoint": "http://<NGINX_PROXY>/_plugin/kibana/" The Kibana UI may take a few minutes to load. Once it is up and running, you will need to configure a few essential parameters before you can actually proceed. Select Settings and hit the Indices option. Here, fill in the following details: Index contains time-based events: Enable this checkbox to index time-based events Use event times to create index names: Enable this checkbox as well Index pattern interval: Set the Index pattern interval to Daily from the drop-down list Index name of pattern: Type [cwl-]YYYY.MM.DD in to this field Time-field name: Select the @timestamp value from the drop-down list Once completed, hit Create to complete the process. With this, you should now start seeing logs populate on to Kibana's dashboard. Feel free to have a look around and try out the various options and filters provided by Kibana:   Phew! That was definitely a lot to cover! But wait, there's more! AWS provides yet another extremely useful governance and configuration management service AWS Config, know more from this book AWS Administration - The Definitive Guide - Second Edition. The Cloud and the DevOps Revolution Serverless computing wars: AWS Lambdas vs Azure Functions
Read more
  • 0
  • 0
  • 26532

article-image-welcome-new-world
Packt
23 Jan 2017
8 min read
Save for later

Welcome to the New World

Packt
23 Jan 2017
8 min read
We live in very exciting times. Technology is changing at a pace so rapid, that it is becoming near impossible to keep up with these new frontiers as they arrive. And they seem to arrive on a daily basis now. Moore's Law continues to stand, meaning that technology is getting smaller and more powerful at a constant rate. As I said, very exciting. In this article by Jason Odom, the author of the book HoloLens Beginner's Guide, we will be discussing about one of these new emerging technologies that finally is reaching a place more material than science fiction stories, is Augmented or Mixed Reality. Imagine the world where our communication and entertainment devices are worn, and the digital tools we use, as well as the games we play, are holographic projections in the world around us. These holograms know how to interact with our world and change to fit our needs. Microsoft has to lead the charge by releasing such a device... the HoloLens. (For more resources related to this topic, see here.) The Microsoft HoloLens changes the paradigm of what we know as personal computing. We can now have our Word window up on the wall (this is how I am typing right now), we can have research material floating around it, we can have our communication tools like Gmail and Skype in the area as well. We are finally no longer trapped to a virtual desktop, on a screen, sitting on a physical desktop. We aren't even trapped by the confines of a room anymore. What exactly is the HoloLens? The HoloLens is a first of its kind, head-worn standalone computer with a sensor array which includes microphones and multiple types of cameras, spatial sound speaker array, a light projector, and an optical waveguide. The HoloLens is not only a wearable computer; it is also a complete replacement for the standard two-dimensional display. HoloLens has the capability of using holographic projection to create multiple screens throughout and environment as well as fully 3D- rendered objects as well. With the HoloLens sensor array these holograms can fully interact with the environment you are in. The sensor array allows the HoloLens to see the world around it, to see input from the user's hands, as well as for it to hear voice commands. While Microsoft has been very quiet about what the entire sensor array includes we have a good general idea about the components used in the sensor array, let's have a look at them: One IMU: The Inertia Measurement Unit (IMU) is a sensor array that includes, an Accelerometer, a Gyroscope, and a Magnetometer. This unit handles head orientation tracking and compensates for drift that comes from the Gyroscopes eventual lack of precision. Four environment understanding sensors: These together form the spatial mapping that the HoloLens uses to create a mesh of the world around the user. One depth camera:Also known as a structured--light 3D scanner. This device is used for measuring the three-dimensional shape of an object using projected light patterns and a camera system. Microsoft first used this type of camera inside the Kinect for the Xbox 360 and Xbox One.  One ambient light sensor:Ambient light sensors or photosensors are used for ambient light sensing as well as proximity detection. 2 MP photo/HD video camera:For taking pictures and video. Four-microphone array: These do a great job of listening to the user and not the sounds around them. Voice is one of the primary input types with HoloLens. Putting all of these elements together forms a Holographic computer that allows the user to see, hear and interact with the world around in new and unique ways. What you need to develop for the HoloLens The HoloLens development environment breaks down to two primary tools, Unity and Visual Studio. Unity is the 3D environment that we will do most of our work in. This includes adding holograms, creating user interface elements, adding sound, particle systems and other things that bring a 3D program to life. If Unity is the meat on the bone, Visual Studio is a skeleton. Here we write scripts or machine code to make our 3D creations come to life and add a level of control and immersion that Unity can not produce on its own. Unity Unity is a software framework designed to speed up the creation of games and 3D based software. Generally speaking, Unity is known as a game engine, but as the holographic world becomes more apparently, the more we will use such a development environment for many different kinds of applications. Unity is an application that allows us to take 3D models, 2D graphics, particle systems, and sound to make them interact with each other and our user. Many elements are drag and drop, plug and play, what you see is what you get. This can simplify the iteration and testing process. As developers, we most likely do not want to build and compile forever little change we make in the development process. This allows us to see the changes in context to make sure they work, then once we hit a group of changes we can test on the HoloLens ourselves. This does not work for every aspect of HoloLens--Unity development but it does work for a good 80% - 90%. Visual Studio community Microsoft Visual Studio Community is a great free Integrated Development Environment (IDE). Here we use programming languages such as C# or JavaScript to code change in the behavior of objects, and generally, make things happen inside of our programs. HoloToolkit - Unity The HoloToolkit--Unity is a repository of samples, scripts, and components to help speed up the process of development. This covers a large selection of areas in HoloLens Development such as: Input:Gaze, gesture, and voice are the primary ways in which we interact with the HoloLens Sharing:The sharing repository helps allow users to share holographic spaces and connect to each other via the network. Spatial Mapping:This is how the HoloLens sees our world. A large 3D mesh of our space is generated and give our holograms something to interact with or bounce off of. Spatial Sound:The speaker array inside the HoloLens does an amazing work of giving the illusion of space. Objects behind us sound like they are behind us. HoloLens emulator The HoloLens emulator is an extension to Visual Studio that will simulate how a program will run on the HoloLens. This is great for those who want to get started with HoloLens development but do not have an actual HoloLens yet. This software does require the use of Microsoft Hyper-V , a feature only available inside of the Windows 10 Pro operating system. Hyper-V is a virtualization environment, which allows the creation of a virtual machine. This virtual machine emulates the specific hardware so one can test without the actual hardware. Visual Studio tools for Unity This collection of tools adds IntelliSense and debugging features to Visual Studio. If you use Visual Studio and Unity this is a must have: IntelliSense:An intelligent code completion tool for Microsoft Visual Studio. This is designed to speed up many processes when writing code. The version that comes with Visual Studios tools for Unity has unity specific updates. Debugging:Up to the point that this extension exists debugging Unity apps proved to be a little tedious. With this tool, we can now debug Unity applications inside Visual Studio speeding of the bug squashing process considerably. Other useful tools Following mentioned are some the useful tools that are required: Image editor: Photoshop or Gimp are both good examples of programs that allow us to create 2D UI elements and textures for objects in our apps. 3D Modeling Software: 3D Studio Max, Maya, and Blender are all programs that allow us to make 3D objects that can be imported in Unity. Sound Editing Software: There are a few resources for free sounds out of the web with that in mind, Sound Forge is a great tool for editing those sounds, layering sounds together to create new sounds. Summary In this article, we have gotten to know a little bit about the HoloLens, so we can begin our journey into this new world. Here the only limitations are our imaginations. Resources for Article: Further resources on this subject: Creating a Supercomputer [article] Building Voice Technology on IoT Projects [article] C++, SFML, Visual Studio, and Starting the first game [article]
Read more
  • 0
  • 0
  • 26530

article-image-different-ways-running-squid-proxy-server
Packt
24 Feb 2011
10 min read
Save for later

Different Ways of Running Squid Proxy Server

Packt
24 Feb 2011
10 min read
  Squid Proxy Server 3.1: Beginner's Guide Improve the performance of your network using the caching and access control capabilities of Squid Get the most out of your network connection by customizing Squid's access control lists and helpers Set up and configure Squid to get your website working quicker and more efficiently No previous knowledge of Squid or proxy servers is required Part of Packt's Beginner's Guide series: lots of practical, easy-to-follow examples accompanied by screenshots Command line options Normally, all of the Squid configuration options reside with in the squid.conf file (the main Squid configuration file). To tweak the Squid functionality, the preferred method is to change the options in the squid.conf file. However there are some options which can also be controlled using additional command line options while running Squid. These options are not very popular and are rarely used, but these are very useful for debugging problems without the Squid proxy server. Before exploring the command line options, let's see how Squid is run from the command line. The location of the Squid binary file depends on the --prefix option passed to the configure command before compiling. So, depending upon the value of the --prefix option, the location of the Squid executable may be one of /usr/local/sbin/squid or ${prefix}/sbin/squid, where ${prefix} is the value of the option --prefix passed to the configure command. Therefore, to run Squid, we need to run one of the following commands on the terminal: When the --prefix option was not used with the configure command, the default location of the Squid executable will be /usr/local/sbin/squid. When the --prefix option was used and was set to a directory, then the location of the Squid executable will be ${prefix}/sbin/squid. It's painful to type the absolute path for Squid to run. So, to avoid typing the absolute path, we can include the path to the Squid executable in our PATH shell variable, using the export command as shown in the following example: $ export PATH=$PATH:/usr/local/sbin/ Alternatively, we can use the following command: $ export PATH=$PATH:/opt/squid/sbin/ We can also add the preceding command to our ~/.bashrc or ~/.bash_profile file to avoid running the export command every time we enter a new shell. After setting the PATH shell variable appropriately, we can run Squid by simply typing the following command on shell: $ squid This command will run Squid after loading the configuration options from the squid.conf file. We'll be using the squid command without an absolute path for running the Squid process. Please use the appropriate path according to the installation prefix which you have chosen. Now that we know how to run Squid from the command line, let's have a look at the various command line options. Getting a list of available options Before actually moving forward, we should firstly check the available set of options for our Squid installation. Time for action – listing the options Like a lot of other Linux programs, Squid also provides the option -h which can be used to retrieve a list of options: squid -h The previous command will result in the following output: Usage: squid [-cdhvzCFNRVYX] [-s | -l facility] [-f config-file] [-[au] port] [-k signal] -a port Specify HTTP port number (default: 3128). -d level Write debugging to stderr also. -f file Use given config-file instead of /opt/squid/etc/squid.conf. -h Print help message. -k reconfigure|rotate|shutdown|interrupt|kill|debug|check|parse Parse configuration file, then send signal to running copy (except -k parse) and exit. -s | -l facility Enable logging to syslog. -u port Specify ICP port number (default: 3130), disable with 0. -v Print version. -z Create swap directories. -C Do not catch fatal signals. -F Don't serve any requests until store is rebuilt. -N No daemon mode. -R Do not set REUSEADDR on port. -S Double-check swap during rebuild. ... We will now have a look at a few important options from the preceding list. We will also, have a look at the squid(8) man page or http://linux.die.net/man/8/squid for more details. What just happened? We have just used the squid command to list the available options which we can use on the command line. Getting information about our Squid installation Various features may vary across different versions of Squid. Before proceeding any further, it's a good idea to know the version of Squid installed on our machine. Time for action – finding out the Squid version Just in case we want to check which version of Squid we are using or the options we used with the configure command before compiling, we can use the option -v on the command line. Let's run Squid with this option: squid -v If we try to run the preceding command in the terminal, it will produce an output similar to the following: configure options: '--config-cache' '--prefix=/opt/squid/' '--enable-storeio=ufs,aufs' '--enable-removal-policies=lru,heap' '--enable-icmp' '--enable-useragent-log' '--enable-referer-log' '--enable-cache-digests' '--with-large-files' --enable-ltdl-convenience What just happened? We used the squid command with the -v option to find out the version of Squid installed on our machine, and the options used with the configure command before compiling Squid. Creating cache or swap directories The cache directories specified using the cache_dir directive in the squid.conf file, must already exist before Squid can actually use them. Time for action – creating cache directories Squid provides the -z command line option to create the swap directories. Let's see an example: squid -z If this option is used and the cache directories don't exist already, the output will look similar to the following: 2010/07/20 21:48:35| Creating Swap Directories 2010/07/20 21:48:35| Making directories in /squid_cache/00 2010/07/20 21:48:35| Making directories in /squid_cache/01 2010/07/20 21:48:35| Making directories in /squid_cache/02 2010/07/20 21:48:35| Making directories in /squid_cache/03 ... We should use this option whenever we add new cache directories in the Squid configuration file. What just happened? When the squid command is run with the option -z, Squid reads all the cache directories from the configuration file and checks if they already exist. It will then create the directory structure for all the cache directories that don't exist. Have a go hero – adding cache directories Add two or three test cache directories with different values of level 1 (8, 16, and 32) and level 2 (64, 256, and 512) to the configuration file. Then try creating them using the squid command. Now study the difference in the directory structure. Using a different configuration file The default location for Squid's main configuration file is ${prefix}/etc/squid/squid.conf. Whenever we run Squid, the main configuration is read from the default location. While testing or deploying a new configuration, we may want to use a different configuration file just to check whether it will work or not. We can achieve this by using the option -f, which allows us to specify a custom location for the configuration file. Let's see an example: squid -f /etc/squid.minimal.conf # OR squid -f /etc/squid.alternate.conf If Squid is run this way, Squid will try to load the configuration from /etc/squid.minimal.conf or /etc/squid.alternate.conf, and it will completely ignore the squid.conf from the default location. Getting verbose output When we run Squid from the terminal without any additional command line options, only warnings and errors are displayed on the terminal (or stderr). However, while testing, we would like to get a verbose output on the terminal, to see what is happening when Squid starts up. Time for action – debugging output in the console To get more information on the terminal, we can use the option -d. The following is an example: squid -d 2 We must specify an integer with the option -d to indicate the verbosity level. Let's have a look at the meaning of the different levels: Only critical and fatal errors are logged when level 0 (zero) is used. Level 1 includes the logging of important problems. Level 2 and higher includes the logging of informative details and other actions. Higher levels result in more output on the terminal. A sample output on the terminal with level 2 would look similar to the following: 2010/07/20 21:40:53| Starting Squid Cache version 3.1.10 for i686-pc-linux-gnu... 2010/07/20 21:40:53| Process ID 15861 2010/07/20 21:40:53| With 1024 file descriptors available 2010/07/20 21:40:53| Initializing IP Cache... 2010/07/20 21:40:53| DNS Socket created at [::], FD 7 2010/07/20 21:40:53| Adding nameserver 192.168.36.222 from /etc/resolv.conf 2010/07/20 21:40:53| User-Agent logging is disabled. 2010/07/20 21:40:53| Referer logging is disabled. 2010/07/20 21:40:53| Unlinkd pipe opened on FD 13 2010/07/20 21:40:53| Local cache digest enabled; rebuild/rewrite every 3600/3600 sec 2010/07/20 21:40:53| Store logging disabled 2010/07/20 21:40:53| Swap maxSize 0 + 262144 KB, estimated 20164 objects 2010/07/20 21:40:53| Target number of buckets: 1008 2010/07/20 21:40:53| Using 8192 Store buckets 2010/07/20 21:40:53| Max Mem size: 262144 KB 2010/07/20 21:40:53| Max Swap size: 0 KB 2010/07/20 21:40:53| Using Least Load store dir selection 2010/07/20 21:40:53| Current Directory is /opt/squid/sbin 2010/07/20 21:40:53| Loaded Icons. As we can see, Squid is trying to dump a log of actions that it is performing. The messages shown are mostly startup messages and there will be fewer messages when Squid starts accepting connections. Starting Squid in debug mode is quite helpful when Squid is up and running and users complain about poor speeds or being unable to connect. We can have a look at the debugging output and the appropriate actions to take. What just happened? We started Squid in debugging mode and can now see Squid writing an output on the command line, which is basically a log of the actions which Squid is performing. If Squid is not working, we'll be able to see the reasons on the command line and we'll be able to take actions accordingly. Full debugging output on the terminal The option -d specifies the verbosity level of the output dumped by Squid on the terminal. If we require all of the debugging information on the terminal, we can use the option -X, which will force Squid to write debugging information at every single step. If the option -X is used, we'll see information about parsing the squid.conf file and the actions taken by Squid, based on the configuration directives encountered. Let's see a sample output produced when option -X is used: ... 2010/07/21 21:50:51.515| Processing: 'acl my_machines src 172.17.8.175 10.2.44.46 127.0.0.1 172.17.11.68 192.168.1.3' 2010/07/21 21:50:51.515| ACL::Prototype::Registered: invoked for type src 2010/07/21 21:50:51.515| ACL::Prototype::Registered: yes 2010/07/21 21:50:51.515| ACL::FindByName 'my_machines' 2010/07/21 21:50:51.515| ACL::FindByName found no match 2010/07/21 21:50:51.515| aclParseAclLine: Creating ACL 'my_machines' 2010/07/21 21:50:51.515| ACL::Prototype::Factory: cloning an object for type 'src' 2010/07/21 21:50:51.515| aclParseIpData: 172.17.8.175 2010/07/21 21:50:51.515| aclParseIpData: 10.2.44.46 2010/07/21 21:50:51.515| aclParseIpData: 127.0.0.1 2010/07/21 21:50:51.515| aclParseIpData: 172.17.11.68 2010/07/21 21:50:51.515| aclParseIpData: 192.168.1.3 ... Let's see what this output means. In the first line, Squid encountered a line defining an ACL my_machines. The next few lines in the output describe Squid invoking different methods to parse, creating a new ACL, and then assigning values to it. This option can be very helpful while debugging ambiguous ACLs. Running as a normal process Sometime during testing, we may not want Squid to run as a daemon. Instead, we may want it to run as a normal process which we can interrupt easily by pressing CTRL-C. To achieve this, we can use the option -N. When this option is used, Squid will not run in the background it will run in the current shell instead. Parsing the Squid configuration file for errors or warnings It's a good idea to parse or check the configuration file (squid.conf) for any errors or warnings before we actually try to run Squid, or reload a Squid process which is already running in a production deployment. Squid provides an option -k with an argument parse, which, if supplied, will force Squid to parse the current Squid configuration file and report any errors and warnings. Squid -k is also used to check and report directive and option changes when we upgrade our Squid version.  
Read more
  • 0
  • 2
  • 26528

article-image-submitting-malware-word-document
Packt
18 Oct 2013
5 min read
Save for later

Submitting a malware Word document

Packt
18 Oct 2013
5 min read
(For more resources related to this topic, see here.) We will submit a document dealing with Iran's Oil and Nuclear Situation. Perform the following steps: Open a new tab in the terminal and type the following command: $ python utils/submit.py --platform windows –package doc shares/Iran's Oil and Nuclear Situation.doc In this case, the document is located inside the shares folder. You have to change the location based on where your document is. Please make sure you get a Success message like the preceding screenshot with task with ID 7 (it is the ID that depends on how many times you tried to submit a malware). Cuckoo will then start the latest snapshot of the virtual machine we've made. Windows will open the Word document. A warning pop-up window will appear as shown in the preceding screenshot. We assume that the users will not be aware of what that warning is, so we will choose I recognize this content. Allow it to play. option and click on the Continue button. Wait a moment until the malware document takes some action. The VM will close automatically after all the actions are finished by the malware document. Now, you will see the Cuckoo status—on the terminal tab where we started Cuckoo—as shown in the following screenshot: We have now finished the submission process. Let's look at the subfolder of cuckoo, in the storage/analyses/ path. There are some numbered folders in storage/analyses, which represent the analysis task inside the database. These folders are based on the task ID we have created before. So, do not be confused when you find folders other than 7. Just find the folder your were searching for based on the task ID. When you see the reporting folder, you will know that Cuckoo Sandbox will make several files in a dedicated directory. Following is an example of an analysis directory structure: |-- analysis.conf|-- analysis.log|-- binary|-- dump.pcap|-- memory.dmp|-- files| |-- 1234567890| `-- dropped.exe|-- logs| |-- 1232.raw| |-- 1540.raw| `-- 1118.raw|-- reports| |-- report.html| |-- report.json| |-- report.maec11.xml| |-- report.metadata.xml| `-- report.pickle`-- shots|-- 0001.jpg|-- 0002.jpg|-- 0003.jpg`-- 0004.jpg Let us have a look at some of them in detail: analysis.conf: This is a configuration file automatically generated by Cuckoo to instruct its analyzer with some details about the current analysis. It is generally of no interest for the end user, as it is exclusively used internally by the sandbox. analysis.log: This is a log file generated by the analyzer and it contains a trace of the analysis execution inside the guest environment. It will report the creation of processes, files, and eventual error occurred during the execution. binary: This is the binary file we have submitted before. dump.pcap: This is the network dump file generated by tcpdump or any other corresponding network sniffer. memory.dmp: In case you enabled it, this file contains the full memory dump of the analysis machine. files: This directory contains all the files the malware operated on and that Cuckoo was able to dump. logs: This directory contains all the raw logs generated by Cuckoo's process monitoring. reports: This directory contains all the reports generated by Cuckoo. shots: This directory contains all the screenshots of the guest's desktop taken during the malware execution. The contents are not always similar to what is mentioned. They depend on how Cuckoo Sandbox analyzes the malware, what is the kind of the submitted malware and its behavior. After analyzing Iran's Oil and Nuclear Situation.doc there will be four folders, namely, files, logs, reports, and shots, and three files, namely, analysis.log, binary, dump.pcap, inside the storage/analyses/7 folder. To know more about how the final result of the execution of malware inside the Guest OS is, it will be more user-friendly if we open the HTML result located inside the reports folder. There will be a file named report.html. We need to double-click it and open it on the web browser. Another option to see the content of report.html is by using this command: $ lynx report.html There are some tabs with information gathered by Cuckoo Sandbox analyzer in your browser: In the File tab from your browser , you may see some interesting information. We can see this malware has been created by injecting a Word document containing nothing but a macro virus on Wednesday, November 9th, between 03:22 – 03:24 hours. What's more interesting is that it is available in the Network tab under Hosts Involved. Under the Hosts Involved option, there is a list of IP addresses, that is, 192.168.2.101, 192.168.2.255, and 192.168.2.100, which are the Guest OS's IP, Network Broadcast's IP, and vmnet0's IP, respectively. Then, what about the public IP 208.115.230.76? This is the IP used by the malware to contact to the server, which makes the analysis more interesting. After knowing that malware try to make contact outside of the host, you must be wondering how the malware make contact with the server. Therefore, we can look at the contents of the dump.pcap file. To open the dump.pcap file, you should install a packet analyzer. In this article, we will use Wireshark packet analyzer. Please make sure that you have installed Wireshark in your host OS, and then open the dump.pcap file using Wireshark. We can see the network activities of the malware in the preceding screenshot. Summary In this article, you have learned how to submit malware samples to Cuckoo Sandbox. This article also described the example of the submission of malicious files that consist of MS Office Word. Resources for Article: Further resources on this subject: Big Data Analysis [Article] GNU Octave: Data Analysis Examples [Article] StyleCop analysis [Article]
Read more
  • 0
  • 0
  • 26522

article-image-normal-maps
Packt
19 Jan 2017
12 min read
Save for later

Normal maps

Packt
19 Jan 2017
12 min read
In this article by Raimondas Pupius, the author of the book Mastering SFML Game Development we will learn about normal maps and specular maps. (For more resources related to this topic, see here.) Lighting can be used to create visually complex and breath-taking scenes. One of the massive benefits of having a lighting system is the ability it provides to add extra details to your scene, which wouldn't have been possible otherwise. One way of doing so is using normal maps. Mathematically speaking, the word "normal" in the context of a surface is simply a directional vector that is perpendicular to the said surface. Consider the following illustration: In this case, what's normal is facing up because that's the direction perpendicular to the plane. How is this helpful? Well, imagine you have a really complex model with many vertices; it'd be extremely taxing to render the said model because of all the geometry that would need to be processed with each frame. A clever trick to work around this, known as normal mapping, is to take the information of all of those vertices and save them on a texture that looks similar to this one: It probably looks extremely funky, especially if being looked of physical release in grayscale, but try not to think of this in terms of colors, but directions. The red channel of a normal map encodes the –x and +x values. The green channel does the same for –y and +y values, and the blue channel is used for –z to +z. Looking back at the previous image now, it's easier to confirm which direction each individual pixel is facing. Using this information on geometry that's completely flat would still allow us to light it in such a way that it would make it look like it has all of the detail in there; yet, it would still remain flat and light on performance: These normal maps can be hand-drawn or simply generated using software such as Crazybump. Let's see how all of this can be done in our game engine. Implementing normal map rendering In the case of maps, implementing normal map rendering is extremely simple. We already have all the material maps integrated and ready to go, so at this time, it's simply a matter of sampling the texture of the tile-sheet normals: void Map::Redraw(sf::Vector3i l_from, sf::Vector3i l_to) { ... if (renderer->UseShader("MaterialPass")) { // Material pass. auto shader = renderer->GetCurrentShader(); auto textureName = m_tileMap.GetTileSet().GetTextureName(); auto normalMaterial = m_textureManager-> GetResource(textureName + "_normal"); for (auto x = l_from.x; x <= l_to.x; ++x) { for (auto y = l_from.y; y <= l_to.y; ++y) { for (auto layer = l_from.z; layer <= l_to.z; ++layer) { auto tile = m_tileMap.GetTile(x, y, layer); if (!tile) { continue; } auto& sprite = tile->m_properties->m_sprite; sprite.setPosition( static_cast<float>(x * Sheet::Tile_Size), static_cast<float>(y * Sheet::Tile_Size)); // Normal pass. if (normalMaterial) { shader->setUniform("material", *normalMaterial); renderer->Draw(sprite, &m_normals[layer]); } } } } } ... } The process is exactly the same as drawing a normal tile to a diffuse map, except that here we have to provide the material shader with the texture of the tile-sheet normal map. Also note that we're now drawing to a normal buffer texture. The same is true for drawing entities as well: void S_Renderer::Draw(MaterialMapContainer& l_materials, Window& l_window, int l_layer) { ... if (renderer->UseShader("MaterialPass")) { // Material pass. auto shader = renderer->GetCurrentShader(); auto textures = m_systemManager-> GetEntityManager()->GetTextureManager(); for (auto &entity : m_entities) { auto position = entities->GetComponent<C_Position>( entity, Component::Position); if (position->GetElevation() < l_layer) { continue; } if (position->GetElevation() > l_layer) { break; } C_Drawable* drawable = GetDrawableFromType(entity); if (!drawable) { continue; } if (drawable->GetType() != Component::SpriteSheet) { continue; } auto sheet = static_cast<C_SpriteSheet*>(drawable); auto name = sheet->GetSpriteSheet()->GetTextureName(); auto normals = textures->GetResource(name + "_normal"); // Normal pass. if (normals) { shader->setUniform("material", *normals); drawable->Draw(&l_window, l_materials[MaterialMapType::Normal].get()); } } } ... } You can try obtaining a normal texture through the texture manager. If you find one, you can draw it to the normal map material buffer. Dealing with particles isn't much different from what we've seen already, except for one little piece of detail: void ParticleSystem::Draw(MaterialMapContainer& l_materials, Window& l_window, int l_layer) { ... if (renderer->UseShader("MaterialValuePass")) { // Material pass. auto shader = renderer->GetCurrentShader(); for (size_t i = 0; i < container->m_countAlive; ++i) { if (l_layer >= 0) { if (positions[i].z < l_layer * Sheet::Tile_Size) { continue; } if (positions[i].z >= (l_layer + 1) * Sheet::Tile_Size) { continue; } } else if (positions[i].z < Sheet::Num_Layers * Sheet::Tile_Size) { continue; } // Normal pass. shader->setUniform("material", sf::Glsl::Vec3(0.5f, 0.5f, 1.f)); renderer->Draw(drawables[i], l_materials[MaterialMapType::Normal].get()); } } ... } As you can see, we're actually using the material value shader in order to give particles' static normals, which are always sort of pointing to the camera. A normal map buffer should look something like this after you render all the normal maps to it: Changing the lighting shader Now that we have all of this information, let's actually use it when calculating the illumination of the pixels inside the light pass shader: uniform sampler2D LastPass; uniform sampler2D DiffuseMap; uniform sampler2D NormalMap; uniform vec3 AmbientLight; uniform int LightCount; uniform int PassNumber; struct LightInfo { vec3 position; vec3 color; float radius; float falloff; }; const int MaxLights = 4; uniform LightInfo Lights[MaxLights]; void main() { vec4 pixel = texture2D(LastPass, gl_TexCoord[0].xy); vec4 diffusepixel = texture2D(DiffuseMap, gl_TexCoord[0].xy); vec4 normalpixel = texture2D(NormalMap, gl_TexCoord[0].xy); vec3 PixelCoordinates = vec3(gl_FragCoord.x, gl_FragCoord.y, gl_FragCoord.z); vec4 finalPixel = gl_Color * pixel; vec3 viewDirection = vec3(0, 0, 1); if(PassNumber == 1) { finalPixel *= vec4(AmbientLight, 1.0); } // IF FIRST PASS ONLY! vec3 N = normalize(normalpixel.rgb * 2.0 - 1.0); for(int i = 0; i < LightCount; ++i) { vec3 L = Lights[i].position - PixelCoordinates; float distance = length(L); float d = max(distance - Lights[i].radius, 0); L /= distance; float attenuation = 1 / pow(d/Lights[i].radius + 1, 2); attenuation = (attenuation - Lights[i].falloff) / (1 - Lights[i].falloff); attenuation = max(attenuation, 0); float normalDot = max(dot(N, L), 0.0); finalPixel += (diffusepixel * ((vec4(Lights[i].color, 1.0) * attenuation))) * normalDot; } gl_FragColor = finalPixel; } First, the normal map texture needs to be passed to it as well as sampled, which is where the first two highlighted lines of code come in. Once this is done, for each light we're drawing on the screen, the normal directional vector is calculated. This is done by first making sure that it can go into the negative range and then normalizing it. A normalized vector only represents a direction. Since the color values range from 0 to 255, negative values cannot be directly represented. This is why we first bring them into the right range by multiplying them by 2.0 and subtracting by 1.0. A dot product is then calculated between the normal vector and the normalized L vector, which now represents the direction from the light to the pixel. How much a pixel is lit up from a specific light is directly contingent upon the dot product, which is a value from 1.0 to 0.0 and represents magnitude. A dot product is an algebraic operation that takes in two vectors, as well as the cosine of the angle between them, and produces a scalar value between 0.0 and 1.0 that essentially represents how “orthogonal” they are. We use this property to light pixels less and less, given greater and greater angles between their normals and the light. Finally, the dot product is used again when calculating the final pixel value. The entire influence of the light is multiplied by it, which allows every pixel to be drawn differently as if it had some underlying geometry that was pointing in a different direction. The last thing left to do now is to pass the normal map buffer to the shader in our C++ code: void LightManager::RenderScene() { ... if (renderer->UseShader("LightPass")) { // Light pass. ... shader->setUniform("NormalMap", m_materialMaps[MaterialMapType::Normal]->getTexture()); ... } ... } This effectively enables normal mapping and gives us beautiful results such as this: The leaves, the character, and pretty much everything in this image now looks like they have a definition, ridges, and crevices; it is lit as if it had geometry, although it's paper-thin. Note the lines around each tile in this particular instance. This is one of the main reasons why normal maps for pixel art, such as tile sheets, shouldn't be automatically generated; it can sample the tiles adjacent to it and incorrectly add bevelled edges. Specular maps While normal maps provide us with the possibility to fake how bumpy a surface is, specular maps allow us to do the same with the shininess of a surface. This is what the same segment of the tile sheet we used as an example for a normal map looks like in a specular map: It's not as complex as a normal map since it only needs to store one value: the shininess factor. We can leave it up to each light to decide how much shine it will cast upon the scenery by letting it have its own values: struct LightBase { ... float m_specularExponent = 10.f; float m_specularStrength = 1.f; }; Adding support for specularity Similar to normal maps, we need to use the material pass shader to render to a specularity buffer texture: void Map::Redraw(sf::Vector3i l_from, sf::Vector3i l_to) { ... if (renderer->UseShader("MaterialPass")) { // Material pass. ... auto specMaterial = m_textureManager->GetResource( textureName + "_specular"); for (auto x = l_from.x; x <= l_to.x; ++x) { for (auto y = l_from.y; y <= l_to.y; ++y) { for (auto layer = l_from.z; layer <= l_to.z; ++layer) { ... // Normal pass. // Specular pass. if (specMaterial) { shader->setUniform("material", *specMaterial); renderer->Draw(sprite, &m_speculars[layer]); } } } } } ... } The texture for specularity is once again attempted to be obtained; it is passed down to the material pass shader if found. The same is true when you render entities: void S_Renderer::Draw(MaterialMapContainer& l_materials, Window& l_window, int l_layer) { ... if (renderer->UseShader("MaterialPass")) { // Material pass. ... for (auto &entity : m_entities) { ... // Normal pass. // Specular pass. if (specular) { shader->setUniform("material", *specular); drawable->Draw(&l_window, l_materials[MaterialMapType::Specular].get()); } } } ... } Particles, on the other hand, also use the material value pass shader: void ParticleSystem::Draw(MaterialMapContainer& l_materials, Window& l_window, int l_layer) { ... if (renderer->UseShader("MaterialValuePass")) { // Material pass. auto shader = renderer->GetCurrentShader(); for (size_t i = 0; i < container->m_countAlive; ++i) { ... // Normal pass. // Specular pass. shader->setUniform("material", sf::Glsl::Vec3(0.f, 0.f, 0.f)); renderer->Draw(drawables[i], l_materials[MaterialMapType::Specular].get()); } } } For now, we don't want any of them to be specular at all. This can obviously be tweaked later on, but the important thing is that we have that functionality available and yielding results, such as the following: This specularity texture needs to be sampled inside a light-pass shader, just like a normal texture. Let's see what this involves. Changing the lighting shader Just as before, a uniform sampler2D needs to be added to sample the specularity of a particular fragment: uniform sampler2D LastPass; uniform sampler2D DiffuseMap; uniform sampler2D NormalMap; uniform sampler2D SpecularMap; uniform vec3 AmbientLight; uniform int LightCount; uniform int PassNumber; struct LightInfo { vec3 position; vec3 color; float radius; float falloff; float specularExponent; float specularStrength; }; const int MaxLights = 4; uniform LightInfo Lights[MaxLights]; const float SpecularConstant = 0.4; void main() { ... vec4 specularpixel = texture2D(SpecularMap, gl_TexCoord[0].xy); vec3 viewDirection = vec3(0, 0, 1); // Looking at positive Z. ... for(int i = 0; i < LightCount; ++i){ ... float specularLevel = 0.0; specularLevel = pow(max(0.0, dot(reflect(-L, N), viewDirection)), Lights[i].specularExponent * specularpixel.a) * SpecularConstant; vec3 specularReflection = Lights[i].color * specularLevel * specularpixel.rgb * Lights[i].specularStrength; finalPixel += (diffusepixel * ((vec4(Lights[i].color, 1.0) * attenuation)) + vec4(specularReflection, 1.0)) * normalDot; } gl_FragColor = finalPixel; } We also need to add in the specular exponent and strength to each light's struct, as it's now part of it. Once the specular pixel is sampled, we need to set up the direction of the camera as well. Since that's static, we can leave it as is in the shader. The specularity of the pixel is then calculated by taking into account the dot product between the pixel’s normal and the light, the color of the specular pixel itself, and the specular strength of the light. Note the use of a specular constant in the calculation. This is a value that can and should be tweaked in order to obtain best results, as 100% specularity rarely ever looks good. Then, all that's left is to make sure the specularity texture is also sent to the light-pass shader in addition to the light's specular exponent and strength values: void LightManager::RenderScene() { ... if (renderer->UseShader("LightPass")) { // Light pass. ... shader->setUniform("SpecularMap", m_materialMaps[MaterialMapType::Specular]->getTexture()); ... for (auto& light : m_lights) { ... shader->setUniform(id + ".specularExponent", light.m_specularExponent); shader->setUniform(id + ".specularStrength", light.m_specularStrength); ... } } } The result may not be visible right away, but upon closer inspection of moving a light stream, we can see that correctly mapped surfaces will have a glint that will move around with the light: While this is nearly perfect, there's still some room for improvement. Summary Lighting is a very powerful tool when used right. Different aspects of a material may be emphasized depending on the setup of the game level, additional levels of detail can be added in without too much overhead, and the overall aesthetics of the project will be leveraged to new heights. The full version of “Mastering SFML Game Development” offers all of this and more by not only utilizing normal and specular maps, but also using 3D shadow-mapping techniques to create Omni-directional point light shadows that breathe new life into the game world. Resources for Article: Further resources on this subject: Common Game Programming Patterns [article] Sprites in Action [article] Warfare Unleashed Implementing Gameplay [article]
Read more
  • 0
  • 0
  • 26521

article-image-what-can-you-expect-at-neurips-2019
Sugandha Lahoti
06 Sep 2019
5 min read
Save for later

What can you expect at NeurIPS 2019?

Sugandha Lahoti
06 Sep 2019
5 min read
Popular machine learning conference NeurIPS 2019 (Conference on Neural Information Processing Systems) will be held on Sunday, December 8 through Saturday, December 14 at the Vancouver Convention Center. The conference invites papers tutorials, and submissions on cross-disciplinary research where machine learning methods are being used in other fields, as well as methods and ideas from other fields being applied to ML.  NeurIPS 2019 accepted papers Yesterday, the conference published the list of their accepted papers. A total of 1429 papers have been selected. Submissions opened from May 1 on a variety of topics such as Algorithms, Applications, Data implementations, Neuroscience, and Cognitive Science, Optimization, Probabilistic Methods, Reinforcement Learning and Planning, and Theory. (The full list of Subject Areas are available here.) This year at NeurIPS 2019, authors of accepted submissions were mandatorily required to prepare either a 3-minute video or a PDF of slides summarizing the paper or prepare a PDF of the poster used at the conference. This was done to make NeurIPS content accessible to those unable to attend the conference. NeurIPS 2019 also introduced a mandatory abstract submission deadline, a week before final submissions are due. Only a submission with a full abstract was allowed to have the full paper uploaded. The authors were also asked to answer questions from the Reproducibility Checklist during the submission process. NuerIPS 2019 tutorial program NeurIPS also invites experts to present tutorials that feature topics that are of interest to a sizable portion of the NeurIPS community and are different from the ones already presented at other ML conferences like ICML or ICLR. They looked for tutorial speakers that cover topics beyond their own research in a comprehensive manner that encompasses multiple perspectives.  The tutorial chairs for NeurIPS 2019 are Danielle Belgrave and Alice Oh. They initially compiled a list based on the last few years’ publications, workshops, and tutorials presented at NeurIPS and at related venues. They asked colleagues for recommendations and conducted independent research. In reviewing the potential candidates, the chair read papers to understand their expertise and watch their videos to appreciate their style of delivery. The list of candidates was emailed to the General Chair, Diversity & Inclusion Chairs, and the rest of the Organizing Committee for their comments on this shortlist. Following a few adjustments based on their input, the potential speakers were selected. A total of 9 tutorials have been selected for NeurIPS 2019: Deep Learning with Bayesian Principles - Emtiyaz Khan Efficient Processing of Deep Neural Network: from Algorithms to Hardware Architectures - Vivienne Sze Human Behavior Modeling with Machine Learning: Opportunities and Challenges - Nuria Oliver, Albert Ali Salah Interpretable Comparison of Distributions and Models - Wittawat Jitkrittum, Dougal Sutherland, Arthur Gretton Language Generation: Neural Modeling and Imitation Learning -  Kyunghyun Cho, Hal Daume III Machine Learning for Computational Biology and Health - Anna Goldenberg, Barbara Engelhardt Reinforcement Learning: Past, Present, and Future Perspectives - Katja Hofmann Representation Learning and Fairness - Moustapha Cisse, Sanmi Koyejo Synthetic Control - Alberto Abadie, Vishal Misra, Devavrat Shah NeurIPS 2019 Workshops NeurIPS Workshops are primarily used for discussion of work in progress and future directions. This time the number of Workshop Chairs doubled, from two to four; selected chairs are Jenn Wortman Vaughan, Marzyeh Ghassemi, Shakir Mohamed, and Bob Williamson. However, the number of workshop submissions went down from 140 in 2018 to 111 in 2019. Of these 111 submissions, 51 workshops were selected. The full list of selected Workshops is available here.  The NeurIPS 2019 chair committee introduced new guidelines, expectations, and selection criteria for the Workshops. This time workshops had an important focus on the nature of the problem, intellectual excitement of the topic, diversity, and inclusion, quality of proposed invited speakers, organizational experience and ability of the team and more.  The Workshop Program Committee consisted of 37 reviewers with each workshop proposal assigned to two reviewers. The reviewer committee included more senior researchers who have been involved with the NeurIPS community. Reviewers were asked to provide a summary and overall rating for each workshop, a detailed list of pros and cons, and specific ratings for each of the new criteria. After all reviews were submitted, each proposal was assigned to two of the four chair committee members. The chair members looked through assigned proposals and their reviews to form an educated assessment of the pros and cons of each. Finally, the entire chair held a meeting to discuss every submitted proposal to make decisions.  You can check more details about the conference on the NeurIPS website. As always keep checking this space for more content about the conference. In the meanwhile, you can read our previous year coverage: NeurIPS Invited Talk: Reproducible, Reusable, and Robust Reinforcement Learning NeurIPS 2018: How machine learning experts can work with policymakers to make good tech decisions [Invited Talk] NeurIPS 2018: Rethinking transparency and accountability in machine learning NeurIPS 2018: Developments in machine learning through the lens of Counterfactual Inference [Tutorial] Accountability and algorithmic bias: Why diversity and inclusion matters [NeurIPS Invited Talk]
Read more
  • 0
  • 0
  • 26494
Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at $19.99/month. Cancel anytime
article-image-implement-memory-oltp-sql-server-linux
Fatema Patrawala
17 Feb 2018
11 min read
Save for later

How to implement In-Memory OLTP on SQL Server in Linux

Fatema Patrawala
17 Feb 2018
11 min read
[box type="note" align="" class="" width=""]Below given article is an excerpt from the book SQL Server on Linux, authored by Jasmin Azemović. This book is a handy guide to setting up and implementing your SQL Server solution on the open source Linux platform.[/box] Today we will learn about the basics of In-Memory OLTP and how to implement it on SQL Server on Linux through the following topics: Elements of performance What is In-Memory OLTP Implementation Elements of performance How do you know if you have a performance issue in your database environment? Well, let's put it in these terms. You notice it (the good), users start calling technical support and complaining about how everything is slow (the bad) or you don't know about your performance issues (the ugly). Try to never get in to the last category. The good Achieving best performance is an iterative process where you need to define a set of tasks that you will execute on a regular basics and monitor their results. Here is a list that will give you an idea and guide you through this process: Establish the baseline Define the problem Fix one thing at a time Test and re-establish the baseline Repeat everything Establishing the baseline is the critical part. In most case scenarios, it is not possible without real stress testing. Example: How many users' systems can you handle on the current configuration? The next step is to measure the processing time. Do your queries or stored procedures require milliseconds, seconds, or minutes to execute? Now you need to monitor your database server using a set of tools and correct methodologies. During that process, you notice that some queries show elements of performance degradation. This is the point that defines the problem. Let's say that frequent UPDATE and DELETE operations are resulting in index fragmentation. The next step is to fix this issue with REORGANIZE or REBUILD index operations. Test your solution in the control environment and then in the production. Results can be better, same, or worse. It depends and there is no magic answer here. Maybe now something else is creating the problem: disk, memory, CPU, network, and so on. In this step, you should re-establish the old or a new baseline. Measuring performance process is something that never ends. You should keep monitoring the system and stay alert. The bad If you are in this category, then you probably have an issue with establishing the baseline and alerting the system. So, users are becoming your alerts and that is a bad thing. The rest of the steps are the same except re-establishing the baseline. But this can be your wake-up call to move yourself in the good category. The ugly This means that you don't know or you don't want to know about performance issues. The best case scenario is a headline on some news portal, but that is the ugly thing. Every decent DBA should try to be light years away from this category. What do you need to start working with performance measuring, monitoring, and fixing? Here are some tips that can help you: Know the data and the app Know your server and its capacity Use dynamic management views—DMVs: Sys.dm_os_wait_stats Sys.dm_exec_query_stats sys.dm_db_index_operational_stats Look for top queries by reads, writes, CPU, execution count Put everything in to LibreOffice Calc or another spreadsheet application and do some basic comparative math Fortunately, there is something in the field that can make your life really easy. It can boost your environment to the scale of warp speed (I am a Star Trek fan). What is In-Memory OLTP? SQL Server In-Memory feature is unique in the database world. The reason is very simple; because it is built-in to the databases' engine itself. It is not a separate database solution and there are some major benefits of this. One of these benefits is that in most cases you don't have to rewrite entire SQL Server applications to see performance benefits. On average, you will see 10x more speed while you are testing the new In-Memory capabilities. Sometimes you will even see up to 50x improvement, but it all depends on the amount of business logic that is done in the database via stored procedures. The greater the logic in the database, the greater the performance increase. The more the business logic sits in the app, the less opportunity there is for performance increase. This is one of the reasons for always separating database world from the rest of the application layer. It has built-in compatibility with other non-memory tables. This way you can optimize thememory you have for the most heavily used tables and leave others on the disk. This also means you won't have to go out and buy expensive new hardware to make large InMemory databases work; you can optimize In-Memory to fit your existing hardware. In-Memory was started in SQL Server 2014. One of the first companies that has started to use this feature during the development of the 2014 version was Bwin. This is an online gaming company. With In-Memory OLTP they improved their transaction speed by 16x, without investing in new expensive hardware. The same company has achieved 1.2 Million requests/second on SQL Server 2016 with a single machine using In-Memory OLTP: https://blogs.msdn.microsoft.com/sqlcat/2016/10/26/how-bwin-is-using-sql-server-2016-in-memory-oltp-to-achieve-unprecedented-performance-and-scale/ Not every application will benefit from In-Memory OLTP. If an application is not suffering from performance problems related to concurrency, IO pressure, or blocking, it's probably not a good candidate. If the application has long-running transactions that consume large amounts of buffer space, such as ETL processing, it's probably not a good candidate either. The best applications for consideration would be those that run high volumes of small fast transactions, with repeatable query plans such as order processing, reservation systems, stock trading, and ticket processing. The biggest benefits will be seen on systems that suffer performance penalties from tables that are having concurrency issues related to a large number of users and locking/blocking. Applications that heavily use the tempdb for temporary tables could benefit from In-Memory OLTP by creating the table as memory optimized, and performing the expensive sorts, and groups, and selective queries on the tables that are memory optimized. In-Memory OLTP quick start An important thing to remember is that the databases that will contain memory-optimized tables must have a MEMORY_OPTIMIZED_DATA filegroup. This filegroup is used for storing the checkpoint needed by SQL Server to recover the memory-optimized tables. Here is a simple DDL SQL statement to create a database that is prepared for In-Memory tables: 1> USE master 2> GO 1> CREATE DATABASE InMemorySandbox 2> ON 3> PRIMARY (NAME = InMemorySandbox_data, 4> FILENAME = 5> '/var/opt/mssql/data/InMemorySandbox_data_data.mdf', 6> size=500MB), 7> FILEGROUP InMemorySandbox_fg 8> CONTAINS MEMORY_OPTIMIZED_DATA 9> (NAME = InMemorySandbox_dir, 10> FILENAME = 11> '/var/opt/mssql/data/InMemorySandbox_dir') 12> LOG ON (name = InMemorySandbox_log, 13> Filename= 14>'/var/opt/mssql/data/InMemorySandbox_data_data.ldf', 15> size=500MB) 16 GO   The next step is to alter the existing database and configure it to access memory-optimized tables. This part is helpful when you need to test and/or migrate current business solutions: --First, we need to check compatibility level of database. -- Minimum is 130 1> USE AdventureWorks 2> GO 3> SELECT T.compatibility_level 4> FROM sys.databases as T 5> WHERE T.name = Db_Name(); 6> GO compatibility_level ------------------- 120 (1 row(s) affected) --Change the compatibility level 1> ALTER DATABASE CURRENT 2> SET COMPATIBILITY_LEVEL = 130; 3> GO --Modify the transaction isolation level 1> ALTER DATABASE CURRENT SET 2> MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT=ON 3> GO --Finlay create memory optimized filegroup 1> ALTER DATABASE AdventureWorks 2> ADD FILEGROUP AdventureWorks_fg CONTAINS 3> MEMORY_OPTIMIZED_DATA 4> GO 1> ALTER DATABASE AdventureWorks ADD FILE 2> (NAME='AdventureWorks_mem', 3> FILENAME='/var/opt/mssql/data/AdventureWorks_mem') 4> TO FILEGROUP AdventureWorks_fg 5> GO   How to create memory-optimized table? The syntax for creating memory-optimized tables is almost the same as the syntax for creating classic disk-based tables. You will need to specify that the table is a memory-optimized table, which is done using the MEMORY_OPTIMIZED = ON clause. A memory-optimized table can be created with two DURABILITY values: SCHEMA_AND_DATA (default) SCHEMA_ONLY If you defined a memory-optimized table with DURABILITY=SCHEMA_ONLY, it means that changes to the table's data are not logged and the data is not persisted on disk. However, the schema is persisted as part of the database metadata. A side effect is that an empty table will be available after the database is recovered during a restart of SQL Server on Linux service.   The following table is a summary of key differences between those two DURABILITY Options. When you create a memory-optimized table, the database engine will generate DML routines just for accessing that table, and load them as DLLs files. SQL Server itself does not perform data manipulation, instead it calls the appropriate DLL: Now let's add some memory-optimized tables to our sample database:     1> USE InMemorySandbox 2> GO -- Create a durable memory-optimized table 1> CREATE TABLE Basket( 2> BasketID INT IDENTITY(1,1) 3> PRIMARY KEY NONCLUSTERED, 4> UserID INT NOT NULL INDEX ix_UserID 5> NONCLUSTERED HASH WITH (BUCKET_COUNT=1000000), 6> CreatedDate DATETIME2 NOT NULL,   7> TotalPrice MONEY) WITH (MEMORY_OPTIMIZED=ON) 8> GO -- Create a non-durable table. 1> CREATE TABLE UserLogs ( 2> SessionID INT IDENTITY(1,1) 3> PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT=400000), 4> UserID int NOT NULL, 5> CreatedDate DATETIME2 NOT NULL, 6> BasketID INT, 7> INDEX ix_UserID 8> NONCLUSTERED HASH (UserID) WITH (BUCKET_COUNT=400000)) 9> WITH (MEMORY_OPTIMIZED=ON, DURABILITY=SCHEMA_ONLY) 10> GO -- Add some sample records 1> INSERT INTO UserLogs VALUES 2> (432, SYSDATETIME(), 1), 3> (231, SYSDATETIME(), 7), 4> (256, SYSDATETIME(), 7), 5> (134, SYSDATETIME(), NULL), 6> (858, SYSDATETIME(), 2), 7> (965, SYSDATETIME(), NULL) 8> GO 1> INSERT INTO Basket VALUES 2> (231, SYSDATETIME(), 536), 3> (256, SYSDATETIME(), 6547), 4> (432, SYSDATETIME(), 23.6), 5> (134, SYSDATETIME(), NULL) 6> GO -- Checking the content of the tables 1> SELECT SessionID, UserID, BasketID 2> FROM UserLogs 3> GO 1> SELECT BasketID, UserID 2> FROM Basket 3> GO   What is natively compiled stored procedure? This is another great feature that comes comes within In-Memory package. In a nutshell, it is a classic SQL stored procedure, but it is compiled into machine code for blazing fast performance. They are stored as native DLLs, enabling faster data access and more efficient query execution than traditional T-SQL. Now you will create a natively compiled stored procedure to insert 1,000,000 rows into Basket: 1> USE InMemorySandbox 2> GO 1> CREATE PROCEDURE dbo.usp_BasketInsert @InsertCount int 2> WITH NATIVE_COMPILATION, SCHEMABINDING AS 3> BEGIN ATOMIC 4> WITH 5> (TRANSACTION ISOLATION LEVEL = SNAPSHOT, 6> LANGUAGE = N'us_english') 7> DECLARE @i int = 0 8> WHILE @i < @InsertCount 9> BEGIN 10> INSERT INTO dbo.Basket VALUES (1, SYSDATETIME() , NULL) 11> SET @i += 1 12> END 13> END 14> GO --Add 1000000 records 1> EXEC dbo.usp_BasketInsert 1000000 2> GO   The insert part should be blazing fast. Again, it depends on your environment (CPU, RAM, disk, and virtualization). My insert was done in less than three seconds, on an average machine. But significant improvement should be visible now. Execute the following SELECT statement and count the number of records:   1> SELECT COUNT(*) 2> FROM dbo.Basket 3> GO ----------- 1000004 (1 row(s) affected)   In my case, counting of one million records was less than one second. It is really hard to achieve this performance on any kind of disk. Let's try another query. We want to know how much time it will take to find the top 10 records where the insert time was longer than 10 microseconds:   1> SELECT TOP 10 BasketID, CreatedDate 2> FROM dbo.Basket 3> WHERE DATEDIFF 4> (MICROSECOND,'2017-05-30 15:17:20.9308732', CreatedDate) 5> >10 6> GO   Again, query execution time was less than a second. Even if you remove TOP and try to get all the records it will take less than a second (in my case scenario). Advantages of InMemory tables are more than obvious.   We learnt about the basic concepts of In-Memory OLTP and how to implement it on new and existing database. We also got to know that a memory-optimized table can be created with two DURABILITY values and finally, we created an In-Memory table. If you found this article useful, check out the book SQL Server on Linux, which covers advanced SQL Server topics, demonstrating the process of setting up SQL Server database solution in the Linux environment.        
Read more
  • 0
  • 0
  • 26486

article-image-sprites-action
Packt
20 Feb 2015
6 min read
Save for later

Sprites in Action

Packt
20 Feb 2015
6 min read
In this article by Milcho G. Milchev, author of the book SFML Essentials, we will see how we can use SFML to create a customized animation using a sequence of images. We will also see how SFML renders an animation. Animation exists in many forms. The traditional approach to animation is drawing a sequence of images which differ slightly from each other, and showing them on a screen one after the other. Even though this approach is still widely used, there are more elegant alternatives. For example, drawing (or modelling in 3D) only the limbs of a character and then animating how they move relative to time is a technique that saves a lot of time for artists. It also creates smoother results because not every frame of the animation has to be redrawn. In this book, we are going to explore only the traditional approach, since it is the simpler solution for programmers, and in many cases it is enough to bring life to any sprite. (For more resources related to this topic, see here.) The setup As we established earlier, the traditional approach involves a set of images that need to change over time. For our example, we will use a crystal, which rotates around its centre. Typically, an animation is kept in a single file (a sprite sheet), where each frame of the animation is stored, and in most cases, each frame is the same size—the size of the object. In our example, the sprite is, 32 x 32 pixels and has eight frames, which play for one second. Here is what the sprite sheet looks like: The following screenshot shows our animation setup in code: First of all, note that we are using the AssetManager class to load our sprite sheet. The next line sets the texture rectangle of the sprite to target the first image in our sprite sheet. Here is what this means in terms of the sprite sheet texture: Next, we will move this texture rectangle once in a while to simulate a rotating crystal. In the previous code, we set the number of frames to eight (as many as there are in  the sprite sheet), and set the time of the animation to one second in total, which means that each frame stays for about 0.125 seconds (the animation duration is divided by the number of frames) at a time. We know what we want to do now, so let's do it: In the code, we first measure the delta time since the last frame and add it to the accumulated time. The last two lines of the code actually do all the work. The first one looks intimidating at first glance, but it is simply a way to choose the correct frame, based on how much time has passed and how long the animation is. The formula timeAsSeconds / animationDuration gives us the time relative to the animation duration. So let's say that 0.4 seconds have passed and our animation duration is 1 second. This leaves us with 0.4 seconds in local animation time. Multiply this 0.4 seconds by the number of frames, and we get the following result: 0.4 * 8 = 3.2 This gives us which frame we should be on at the moment, and how long we have been there. The current frame index is the whole part of 3.2 (which is three), and the fraction part (0.2) is how long we have been on that frame. In this case, we are only interested in the current frame so we will take that by casting the whole expression to int. This rounds the number down if the number is positive (which it always is in this case). The last part, % frameNum is there to restart the animation when it reaches beyond its last frame. So in the case where 2.3 seconds have passed, we have the following result: 2.3 * 8 = 18.4 We do not have a 19th frame to show, so we show the frame which corresponds to that in our local scale [0…7]. In this case: 18 / 8 = 2 (and 2 remainder) Since the % operator takes the remainder of a division, we are left to show the frame with the index two, which is the third frame. (We start counting from zero as programmers, remember?) The last line of the code sets the texture rectangle to the current frame. The process is quite straightforward—since we only have frames on the x axis, we do not need to worry about the y coordinate of the rectangle, and so we will set it to zero. The x is computed by animFrame * spriteSize.x, which multiplies the current frame by the width of the frame. In the case, the current frame is two and the frame's width is 32, so we get: 2 * 32 = 64 Here is what the texture rectangle will look like: The last thing we need to do is render the sprite inside the render frame and we are done. If everything goes smoothly, we should have a rotating crystal on the screen with eight frames. With this technique, we can animate sprites of all kinds no matter how many frames they have or how long the animation is. There are problems with the current approach though - the code looks messy, and it is only useful for a single animation. What if we want multiple animations for a sprite (rotating the crystal in a vertical direction as well), and we want to be able to switch between them? Currently, we would have to duplicate all our code for each animation and each animated sprite. In the next section, we will talk about how to avoid these issues by building a fully featured animation system that requires as little code duplication as possible. Summary Sprite animations seem quite easy now, don't they? Just keep in mind that there is a lot more to explore when it comes to animation. Not only are there different techniques to doing them, but also perfecting what we've developed so far might take some time. Fortunately, what we have so far will work as is in the majority of cases so I would say that you are pretty much set to go. If you want to dig deeper, buy the book and read SFML Essentials in a simple step-by-step fashion by using SFML library to create realistic looking animations as well as to develop  2D and 3D games using the SFML. Resources for Article: Further resources on this subject: Vmware Vcenter Operations Manager Essentials - Introduction To Vcenter Operations Manager [article] Translating a File In Sdl Trados Studio [article] Adding Finesse To Your Game [article]
Read more
  • 0
  • 0
  • 26484

article-image-vue-maintainers-proposed-listened-and-revised-the-rfc-for-hooks-in-vue-api
Bhagyashree R
28 Jun 2019
6 min read
Save for later

Vue maintainers proposed, listened, and revised the RFC for hooks in Vue API

Bhagyashree R
28 Jun 2019
6 min read
The internet was ablaze when Evan You, creator of Vue, published an RFC to introduce a function-based component API earlier this month. This followed a huge discussion in the Vue community on whether such an API is really needed. https://twitter.com/youyuxi/status/1137567675356291072 This proposal came after Evan You previewed an experimental Hooks API back in November at Vue Conf Toronto 2018. Why Vue needs a function-based component API Components help you to abstract your code into smaller pieces. This gives your web applications a better structure, makes the code more readable and understandable and most importantly enables you to reuse logic across multiple components. According to the RFC, the components API in Vue 2.x has some drawbacks in terms of reusability. The three common patterns that are generally used to achieve reusability in Vue are mixins, high-order components, and renderless components. Each of these come with their share of drawbacks: Mixins bring implicit dependencies in code, causes name clashes, and make your code harder to understand. HOCs can often be verbose, involve lots of passing props and hoisting statics, and can cause name conflicts. Renderless components require extra stateful component instances that come at the cost of performance. This function-based component API aims to address all these drawbacks. Inspired by React Hooks, its objective is to provide developers a “clean and flexible way” to compose logic and share it between components. The team plans to achieve this by moving the logic code to a "composition function" and returning reactive state. Another motivation behind this proposed change is to provide better built-in TypeScript type inference support as function-based APIs are naturally type-friendly. Also, code written with function-based APIs compresses better than an object or class-based code. What Vue developers think about this RFC? The Vue community was a little taken aback with this proposal that will essentially change the way they used to write Vue. They were concerned that this will take away the most desirable property of Vue, which is its simplicity. Vue’s class-based API made it easy to understand and get started with. However, bringing function-based API to Vue will complex things in exchange for very fewer advantages. Some argued that this change will make it another React. “Like a lot of others here, I chose Vue vs React for the simplicity and readability of code. The class-based API was easy to understand and pick up. If I wanted React, I would have just chosen React from the beginning. I get that there are some technical advantages to doing this, but Vue 3 is starting to really turn me off of staying with Vue going forward,” a developer shared on a Reddit thread. Developers were concerned that the time they have invested in learning Vue will go to waste as everything is about to change. A Vue developer commented on Reddit, “You learn to do something one way and then they change it up on you. Might as well just switch to react at this point.” Many compared this scenario to that of Angular 1->2 or Python 2->3 and suggested switching to Svelte to avoid the mess. Some, however, liked the syntax and are looking forward to playing around with the API.  A developer shared, “But I read through, checked out the new (simpler) example, read Evan's arguments about logical task grouping, and on a second read with a more open mind, I actually kind of like the new syntax and am now looking forward to trying it out. I'm glad they agreed to keep the object syntax around though.” How the Vue team responded When the RFC was first published it implied that the current API will be deprecated in a future major release. Also, there was a lot of confusion around the "compatibility" and "stable" build.  Many developers felt that this RFC is already “set in stone” from the way it was communicated. They felt that the core team has already decided to bring this API to Vue without community consultation. So, one of the reasons behind this confusion was how the change was communicated. The team acknowledged this and asked for suggestions from the community to improve their communication. https://twitter.com/N_Tepluhina/status/1142715703558103040 The core team clarified that the update will be additive and the team has no plans to remove the Object API in a future major release. Evan You, the creator of Vue, said in a thread, “feel free to stay with the current API for as long as you wish. As long as the community feels there's a need for the old API to stay, it will stay. The only one that can make the decision to switch to the new API is yourself.” He also addressed the concerns on a Hacker News thread: There is a lot of FUD in this thread so we need to clarify a bit: - This API is purely additive to 2.x and doesn't break anything. - 3.0 will have a standard build which adds this API on top of 2.x API, and an opt-in "lean build" which drops a number of 2.x APIs for a smaller and faster runtime. - This is an open RFC, which means it's not set in stone. The whole point of having an RFC is so that users can voice their opinions. It's not like we are shipping this tomorrow. After listening to various perspectives shared by developers, the core team revised the RFC accordingly putting everybody finally at ease. Guillaume Chau, a member of the Vue core team, put out a clear and concise plan of action on Twitter to which people are responding positively. This plan reassured that the Object API will not be deprecated until the community stops using it and the proposed API will be first offered as a standalone plugin for Vue 2.x. https://twitter.com/Akryum/status/1143114880960126976 Some developers have also started to try out the new API: https://twitter.com/igor_randj/status/1143302939496370177 https://twitter.com/cmsalvado/status/1143230023089786880 Closing thoughts Open source programmers put their time and efforts in building software that helps the community and an RFC (request for comments) is a way for the community to get involved in building high quality software at scale. Through RFC you can share your constructive feedback on why a change is necessary or is not necessary. And, all this can be done in a respectful way. This showed us a very good example of how an RFC should really work. Publishing an RFC, discussing with the community, listening to the community, and deciding collectively what to do next. Despite some hiccups in communication, the Vue core team did a good job in engaging with the community to develop the roadmap for the function-based component API in Vue. Read the RFC for function-based component API for more details. Vue 2.6 is now out with a new unified syntax for slots, and more Learning Vue in 2019 with Anthony Gore’s developer knowledge map Evan You shares Vue 3.0 updates at VueConf Toronto 2018
Read more
  • 0
  • 0
  • 26456

article-image-highest-paying-data-science-jobs-2017
Amey Varangaonkar
27 Nov 2017
10 min read
Save for later

Highest Paying Data Science Jobs in 2017

Amey Varangaonkar
27 Nov 2017
10 min read
It is no secret that this is the age of data. More data has been created in the last 2 years than ever before. Within the dumps of data created every second, businesses are looking for useful, action worthy insights which they can use to enhance their processes and thereby increase their revenue and profitability. As a result, the demand for data professionals, who sift through terabytes of data for accurate analysis and extract valuable business insights from it, is now higher than ever before. Think of Data Science as a large tree from which all things related to data branch out - from plain old data management and analysis to Big Data, and more.  Even the recently booming trends in Artificial Intelligence such as machine learning and deep learning are applied in many ways within data science. Data science continues to be a lucrative and growing job market in the recent years, as evidenced by the graph below: Source: Indeed.com In this article, we look at some of the high paying, high trending job roles in the data science domain that you should definitely look out for if you’re considering data science as a serious career opportunity. Let’s get started with the obvious and the most popular role. Data Scientist Dubbed as the sexiest job of the 21st century, data scientists utilize their knowledge of statistics and programming to turn raw data into actionable insights. From identifying the right dataset to cleaning and readying the data for analysis, to gleaning insights from said analysis, data scientists communicate the results of their findings to the decision makers. They also act as advisors to executives and managers by explaining how the data affects a particular product or process within the business so that appropriate actions can be taken by them. Per Salary.com, the median annual salary for the role of a data scientist today is $122,258, with a range between $106,529 to $137,037. The salary is also accompanied by a whole host of benefits and perks which vary from one organization to the other, making this job one of the best and the most in-demand, in the job market today. This is a clear testament to the fact that an increasing number of businesses are now taking the value of data seriously, and want the best talent to help them extract that value. There are over 20,000 jobs listed for the role of data scientist, and the demand is only growing. Source: Indeed.com To become a data scientist, you require a bachelor’s or a master’s degree in mathematics or statistics and work experience of more than 5 years in a related field. You will need to possess a unique combination of technical and analytical skills to understand the problem statement and propose the best solution, good programming skills to develop effective data models, and visualization skills to communicate your findings with the decision makers. Interesting in becoming a data scientist? Here are some resources to help you get started: Principles of Data Science Data Science Algorithms in a Week Getting Started with R for Data Science [Video] For a more comprehensive learning experience, check out our skill plan for becoming a data scientist on Mapt, our premium skills development platform. Data Analyst Probably a term you are quite familiar with, Data Analysts are responsible for crunching large amounts of data and analyze it to come to appropriate logical conclusions. Whether it’s related to pure research or working with domain-specific data, a data analyst’s job is to help the decision-makers’ job easier by giving them useful insights. Effective data management, analyzing data, and reporting results are some of the common tasks associated with this role. How is this role different than a data scientist, you might ask. While data scientists specialize in maths, statistics and predictive analytics for better decision making, data analysts specialize in the tools and components of data architecture for better analysis. Per Salary.com, the median annual salary for an entry-level data analyst is $55,804, and the range usually falls between $50,063 to $63,364 excluding bonuses and benefits. For more experienced data analysts, this figure rises to around a mean annual salary of $88,532. With over 83,000 jobs listed on Indeed.com, this is one of the most popular job roles in the data science community today. This profile requires a pretty low starting point, and is justified by the low starting salary packages. As you gain more experience, you can move up the ladder and look at becoming a data scientist or a data engineer. Source: Indeed.com You may also come across terms such as business data analyst or simply business analyst which are sometimes interchangeably used with the role of a data analyst. While their primary responsibilities are centered around data crunching, business analysts model company infrastructure, while data analysts model business data structures. You can find more information related to the differences in this interesting article. If becoming a data analyst is something that interests you, here are some very good starting points: Data Analysis with R Python Data Analysis, Second Edition Learning Python Data Analysis [Video] Data Architect Data architects are responsible for creating a solid data management blueprint for an organization. They are primarily responsible for designing the data architecture and defining how data is stored, consumed and managed by different applications and departments within the organization. Because of these critical responsibilities, a data architect’s job is a very well-paid one. Per Salary.com, the median annual salary for an entry-level data architect is $74,809, with a range between $57,964 to $91,685. For senior-level data architects, the median annual salary rises up to $136,856, with a range usually between $121,969 to $159,212. These high figures are justified by the critical nature of the role of a data architect - planning and designing the right data infrastructure after understanding the business considerations to get the most value out of the data. At present, there are over 23,000 jobs for the role listed on Indeed.com, with a stable trend in job seeker interest, as shown: Source: Indeed.com To become a data architect, you need a bachelor’s degree in computer science, mathematics, statistics or a related field, and loads of real-world skills to qualify for even the entry-level positions. Technical skills such as statistical modeling, knowledge of languages such as Python and R, database architectures, Hadoop-based skills, knowledge of NoSQL databases, and some machine learning and data mining are required to become a data architect. You also need strong collaborative skills, problem-solving, creativity and the ability to think on your feet, to solve the trickiest of problems on the go. Suffice to say it’s not an easy job, but it is definitely a lucrative one! Get ahead of the curve, and start your journey to becoming a data architect now: Big Data Analytics Hadoop Blueprints PostgreSQL for Data Architects Data Engineer Data engineers or Big Data engineers are a crucial part of the organizational workforce and work in tandem with data architects and data scientists to ensure appropriate data management systems are deployed and the right kind of data is being used for analysis. They deal with messy, unstructured Big Data and strive to provide clean, usable data to the other teams within the organization. They build high-performance analytics pipelines and develop set of processes for efficient data mining. In many companies, the role of a data engineer is closely associated with that of a data architect. While an architect is responsible for the planning and designing stages of the data infrastructure project, a data engineer looks after the construction, testing and maintenance of the infrastructure. As such data engineers tend to have a more in-depth understanding of different data tools and languages than data architects. There are over 90,000 jobs listed on Indeed.com, suggesting there is a very high demand in the organizations for this kind of a role. An entry level data engineer has a median annual salary of $90,083 per Payscale.com, with a range of $60,857 to $131,851. For Senior Data Engineers, the average salary shoots up to $123,749 as per Glassdoor estimates. Source: Indeed.com With the unimaginable rise in the sheer volume of data, the onus is on the data engineers to build the right systems that empower the data analysts and data scientists to sift through the messy data and derive actionable insights from it. If becoming a data engineer is something that interests you, here are some of our products you might want to look at: Real-Time Big Data Analytics Apache Spark 2.x Cookbook Big Data Visualization You can also check out our detailed skill plan on becoming a Big Data Engineer on Mapt. Chief Data Officer There is a countless number of organizations that build their businesses on data, but don’t manage it that well. This is where a senior executive popularly known as the Chief Data Officer (CDO) comes into play - bearing the responsibility for implementing the organization’s data and information governance and assisting with data-driven business strategies. They are primarily responsible for ensuring that their organization gets the most value out of their data and put appropriate plans in place for effective data quality and its life-cycle management. The role of a CDO is one of the most lucrative and highest paying jobs in the data science frontier. An average median annual pay for a CDO per Payscale.com is around $192,812. Indeed.com lists just over 8000 job postings too - this is not a very large number, but understandable considering the recent emergence of the role and because it’s a high profile, C-suite job. Source: Indeed.com According to a Gartner research, almost 50% companies in a variety of regulated industries will have a CDO in place, by 2017. Considering the demand for the role and the fact that it is only going to rise in the future, the role of a CDO is one worth vying for.   To become a CDO, you will obviously need a solid understanding of statistical, mathematical and analytical concepts. Not just that, extensive and high-value experience in managing technical teams and information management solutions is also a prerequisite. Along with a thorough understanding of the various Big Data tools and technologies, you will need strong communication skills and deep understanding of the business. If you’re planning to know more about how you can become a Chief Data Officer, you can browse through our piece on the role of CDO. Why demand for data science professionals will rise It’s hard to imagine an organization which doesn’t have to deal with data, but it’s harder to imagine the state of an organization with petabytes of data and not knowing what to do with it. With the vast amounts of data, organizations deal with these days, the need for experts who know how to handle the data and derive relevant and timely insights from it is higher than ever. In fact, IBM predicts there’s going to be a severe shortage of data science professionals, and thereby, a tremendous growth in terms of job offers and advertised openings, by 2020. Not everyone is equipped with the technical skills and know-how associated with tasks such as data mining, machine learning and more. This is slowly creating a massive void in terms of talent that organizations are looking to fill quickly, by offering lucrative salaries and added benefits. Without the professional expertise to turn data into actionable insights, Big Data becomes all but useless.      
Read more
  • 0
  • 0
  • 26415
article-image-obfuscating-command-and-control-c2-servers-securely-with-redirectors-tutorial
Amrata Joshi
16 Jan 2019
11 min read
Save for later

Obfuscating Command and Control (C2) servers securely with Redirectors [Tutorial]

Amrata Joshi
16 Jan 2019
11 min read
A redirector server is responsible for redirecting all the communication to the C2 server. Let's explore the basics of redirector using a simple example. Take a scenario in which we have already configured our team server and we're waiting for an incoming Meterpreter connection on port 8080/tcp. Here, the payload is delivered to the target and has been executed successfully. This article is an excerpt taken from the book Hands-On Red Team Tactics written by Himanshu Sharma and Harpreet Singh. This book covers advanced methods of post-exploitation using Cobalt Strike and introduces you to Command and Control (C2) servers and redirectors. In this article, you will understand the basics of redirectors, the process of obfuscating C2 securely, domain fronting and much more. To follow are the things that will happen next: On payload execution, the target server will try to connect to our C2 on port 8080/tcp. Upon successful connection, our C2 will send the second stage as follows: A Meterpreter session will then open and we can access this using Armitage: However, the target server's connection table will have our C2s IP in it. This means that the monitoring team can easily get our C2 IP and block it: Here's the current situation. This is displayed in an architectural format in order to aid understanding: To protect our C2 from being burned, we need to add a redirector in front of our C2. Refer to the following image for a clear understanding of this process: This is currently the IP information of our redirector and C2: Redirector IP: 35.153.183.204 C2 IP: 54.166.109.171 Assuming that socat is installed on the redirector server, we will execute the following command to forward all the communications on the incoming port 8080/tcp to our C2: Our redirector is now ready. Now let's generate a one-liner payload with a small change. This time, the lhost will be set to the redirector IP instead of the C2: Upon execution of the payload, the connection will initiate from the target server and the server will try to connect with the redirector: We might now notice something different about the following image as the source IP is redirector instead of the target server: Let's take a look at the connection table of the target server: The connection table doesn't have our C2 IP and neither does the Blue team. Now the redirector is working perfectly, what could be the issue with this C2-redirector setup? Let's perform a port scan on the C2 to check the available open ports: As we can see from the preceding screenshot, port 8080/tcp is open on our C2. This means that anyone can try to connect to our listener in order to confirm its existence. To avoid situations like this, we should configure our C2 in such a way that allows us to protect it from outside reconnaissance (recon) and attacks. Obfuscating C2 securely To put it in a diagrammatic format, our current C2 configuration is this: If someone tries to connect to our C2 server, they will be able to detect that our C2 server is running a Meterpreter handler on port 8080/tcp: To protect our C2 server from outside scanning and recon, let's set the following Uncomplicated Firewall (UFW) ruleset so that only our redirector can connect to our C2. To begin, execute the following UFW commands to add firewall rules for C2: sudo ufw allow 22 sudo ufw allow 55553 sudo ufw allow from 35.153.183.204 to any port 8080 proto tcp sudo ufw allow out to 35.153.183.204 port 8080 proto tcp sudo ufw deny out to any The given commands needs to be executed and the result is shown in the following screenshot: In addition, execute the following ufw commands to add firewall rules for redirector as well: sudo ufw allow 22 sudo ufw allow 8080 The given commands needs to be executed and the result is shown in the following screenshot: Once the ruleset is in place, this can be described as follows: If we try to perform a port scan on the C2 now, the ports will be shown as filtered: as shown below. Furthermore, our C2 is only accessible from our redirector now. Let's also confirm this by doing a port scan on our C2 from redirector server: Short-term and long-term redirectors Short-term (ST)—also called short haul—C2 are those C2 servers on which the beaconing process will continue. Whenever a system in the targeted organization executes our payload, the server will connect with the ST-C2 server. The payload will periodically poll for tasks from our C2 server, meaning that the target will call back to the ST-C2 server every few seconds. The redirector placed in front of our ST-C2 server is called the short-term (ST) redirector. This is responsible for handling ST-C2 server connections on which the ST-C2 will be used for executing commands on the target server in real time. ST and LT redirectors would get caught easily during the course of engagement because they're placed at the front. Long-term (LT)—also known as long-haul—C2 server is where the callbacks received from the target server will be after every few hours or days. The redirector placed in front of our LT-C2 server is called a long-term (LT) redirector. This redirector is used to maintain access for a longer period of time than ST redirectors. When performing persistence via the ST-C2 server, we need to provide the domain of our LT redirector so that the persistence module running on the target server will connect back to the LT redirector instead of the ST redirector. A segregated red team infrastructure setup would look something like this: Source: https://payatu.com/wp-content/uploads/2018/08/redteam_infra.png Once we have a proper red team infrastructure setup, we can focus on the kind of redirection we want to have in our ST and LT redirectors. Redirection methods There are two ways in which we can perform redirection: Dumb pipe redirection Filtration/smart redirection Dumb pipe redirection The dumb pipe redirectors blindly forward the network traffic from the target server to our C2, or vice-versa. This type of redirector is useful for quick configuration and setup, but they lack a level of control over the incoming traffic. Dumb pipe redirection will obfuscate (hide) the real IP of our C2, but won't it distract the defenders of the organization from investigating our setup. We can perform dumb pipe redirection using socat or iptables. In both cases, the network traffic will be redirected either to our ST-C2 server or LT-C2 server. Source: https://payatu.com/wp-content/uploads/2018/08/dumb_pipe_redirection123.png Let's execute the command given in the following image in order to configure a dumb pipe redirector which would redirect to our C2 on port 8080/tcp: Following are the commands that we can execute to perform dumb pipe redirection using iptables: iptables -I INPUT -p tcp -m tcp --dport 8080 -j ACCEPT iptables -t nat -A PREROUTING -p tcp --dport 8080 -j DNAT --to-destination 54.166.109.171:8080 iptables -t nat -A POSTROUTING -j MASQUERADE iptables -I FORWARD -j ACCEPT iptables -P FORWARD ACCEPT sysctl net.ipv4.ip_forward=1 The given commands needs to be executed and the result is shown in the following screenshot: (Ignore the sudo error here. This has occurred because of the hostname that we changed) Using socat or iptables, the result would be same i.e. the network traffic on the redirector's interface will be forwarded to our C2. Filtration/smart redirection Filtration redirection, also known as smart redirection, doesn't just blindly forward the network traffic to the C2. Smart redirection will always process the network traffic based on the rules defined by the red team before forwarding it to the C2. In a smart redirection, if the C2 traffic is invalid, the network traffic will either be forwarded to a legitimate website or it would just drop the packets. Only if the network traffic is for our C2 will the redirection work accordingly: To configure a smart redirection, we need to install a web service and configure it. Let's install Apache server on the redirector using the sudo apt install apache2 command: We need to execute the following commands as well in order to enable Apache modules to be rewritten, and also to enable SSL: sudo apt-get install apache2 sudo a2enmod ssl rewrite proxy proxy_http sudo a2ensite default-ssl.conf sudo service apache2 restart These are all commands that needs to be executed. The result of the executed commands are shown in the following screenshot: We also need to configure the Apache from its configuration: We need to look for the Directory directive in order to change the AllowOverride from None to All so that we can use our custom .htaccess file for web request filtration. We can now set up the virtual host setting and add this to wwwpacktpub.tk (/etc/apache2/sites-enabled/default-ssl.conf): After this, we can generate the payload with a domain such as wwwpacktpub.tk in order to get a connection. Domain fronting According to https://resources.infosecinstitute.com/domain-fronting/: Domain fronting is a technique that is designed to circumvent the censorship employed for certain domains (censorship may occur for domains that are not in line with a company's policies, or they may be a result of the bad reputation of a domain). Domain fronting works at the HTTPS layer and uses different domain names at different layers of the request (more on this later). To the censors, it looks like the communication is happening between the client and a permitted domain. However, in reality, communication might be happening between the client and a blocked domain. To make a start with domain fronting, we need to get a domain that is similar to our target organization. To check for domains, we can use the domainhunter tool. Let's clone the repository to continue: We need to install some required Python packages before continuing further. This can be achieved by executing the pip install -r requirements.txt command as follows: After installation, we can run the tool by executing the python domainhunter.py command as follows: By default, this will fetch for the expired and deleted domains that have a blank name because we didn't provide one: Let's check for the help option to see how we can use domainhunter: Let's search for a keyword to look for the domains related to the specified keyword. In this case, we will use packtpub as the desired keyword: We just found out that wwwpacktpub.com is available. Let's confirm its availability at domain searching websites as follows: This confirms that the domain is available on name.com and even on dot.tk for almost $8.50: Let's see if we can find a free domain with a different TLD: We have found that the preceding-mentioned domains are free to register. Let's select wwwpacktpub.tk as follows: We can again check the availability of www.packtpub.tk and obtain this domain for free: In the preceding setting, we need to set our redirector's IP address in the Use DNS field: Let's review the purchase and then check out: Our order has now been confirmed. We just obtained wwwpacktpub.tk: Let's execute the dig command to confirm our ownership of this: The dig command resolves wwwpacktpub.tk to our redirector's IP. Now that we have obtained this, we can set the domain in the stager creation and get the back connection from wwwpacktpub.tk: In this article, we have learned the basics of redirectors and we have also covered how we can obfuscate C2s in a secure manner so that we can protect our C2s from getting detected by the Blue team.  This article also covered short-term and long-term C2s and much more. To know more about advanced penetration testing tools and more check out the book Hands-On Red Team Tactics written by Himanshu Sharma and Harpreet Singh. Introducing numpywren, a system for linear algebra built on a serverless architecture Fortnite server suffered a minor outage, Epic Games was quick to address the issue Windows Server 2019 comes with security, storage and other changes
Read more
  • 0
  • 0
  • 26388

article-image-ios-12-top-choice-for-app-developers-security
Guest Contributor
17 Dec 2019
6 min read
Save for later

Why is iOS 12 a top choice for app developers when it comes to security

Guest Contributor
17 Dec 2019
6 min read
When it comes to mobile operating systems, iOS 12 is generally considered to be one of the most secure — if not the leader — in mobile security. It's now a little more than a year old, and its features may be a bit overshadowed by the launch of iOS 13. Still, a considerable number of devices run iOS 12, and developers should know about its security features. Further Reading If you want to build iOS 12 applications from scratch with the latest Swift 4.2 language and Xcode 10, explore our book iOS 12 Programming for Beginners by Craig Clayton.  For beginners, this book starts by introducing you to iOS development as you learn Xcode 10 and Swift 4.2. You'll also study advanced iOS design topics, such as gestures and animations. The book also details new iOS 12 features, such as the latest in notifications, custom-UI notifications, maps, and the recent additions in Sirikit.  Below are the most prominent changes iOS 12 made in terms of security. Based on these changes, app developers can take advantage of several safety features if they want to build secure mobile apps for devices running on this OS. Major security features in iOS 12 iOS 12's biggest security upgrades were primarily outright new features. In general, these changes reflected a pivot towards privacy, i.e., giving users more control over how their data can be collected and used, as well as towards better password and device security. Default updating: Automatic software updates are now turned on by default. This feature is good news for developers — if they need to push an update that patches a major security flaw, most users will update to the more secure version of the app automatically. Users are also likely to have the most secure version of first-party apps and iOS 12. Password auditing: iOS 12's password auditing tools let users know when they've used the same password more than twice — devices themselves now encourage users to create strong and secure passwords when logging into their apps. The OS keeps a record of all passwords a user creates and stores them on the iCloud. While this feature may not sound particularly secure — especially considering iCloud's discovered security flaw last year — all these passwords are encrypted with AES-256. USB connection: If a user hasn't unlocked a device running iOS 12 in more than an hour, USB devices won't be able to connect. Safari upgrades: The mobile version of Safari will now, by default, prevent websites from using tracking cookies without explicit user permission. 2FA integration: iOS 12 offers better native integration with two-factor authentication (2FA). If an app uses 2FA and sends a security code to a user's phone over text, iOS 12 can autofill the security code field for the user. This may be a good reason for developers to consider implementing 2FA functionality if their apps don't already support it. Improvements in iOS 12 specific to app developers Other changes in iOS 12 were more subtle to end-users but more relevant to app developers. Automated password generation: Since iOS 11, developers have been able to label their password and username fields, allowing users to automatically populate these fields with saved passwords and usernames for a specific app or Safari webpage. With iOS 12's new functionality, users can have iOS 12 generate a unique, strong password that fills the password field once prompted by an app. In-house business app development: Apple now supports the development of in-house business apps. Businesses that partner with Apple through the Apple Developer Enterprise Program can develop apps that work only on specific, permitted devices. Sandboxed apps: By default, all third-party apps are now sandboxed and cannot directly access files modified by other apps. If an app needs to alter files outside of its specific home directory — which is randomly assigned by iOS 12 on install — it will do so through iOS. The same is true for all system files and resources. If an app needs to run a background process, it can do so only through system-provided APIs. Content sharing: Apps created by the same developer can share content — like user preferences and stored data — with each other when configured to be part of an App Group. App frameworks: New software development frameworks like HomeKit are now available to developers working with iOS 12. HomeKit allows developers to create apps that configure or otherwise communicate with smart home appliances and IoT devices. Likewise, SiriKit lets developers update their apps to work with user requests that originate from Siri and Maps. Editor’s tip: To learn more about SirKit you can through Chapter 24 of the book iOS 12 Programming for Beginners by Packt Publishing. Handoff: iOS 12's new Handoff feature allows developers to design apps and websites so that users can use an app one device, then seamlessly transfer their activity to another. The feature will be useful for developers working on apps that also have web versions. App Store review guideline updates with iOS 12 Along with the launch of iOS 12 came some changes to the App Store review guidelines. App developers will need to be aware of these if they want to continue developing programs for iOS devices. Apple now limits the amount of data, developers can collect from user's address books — and how apps are allowed to use this data. This fact doesn't bar developers from using an iPhone's address book to add social functionality to their apps. Developers can still scan a user's contact lists to allow users to send invites or to link users up with friends who also use a specific app. Developers, however, can't maintain and transfer databases of user address information. Apple also banned the selling of user info to third parties. Some tech analysts consider this a response to the Cambridge Analytica scandal of last year, as well as growing discontentment over how large companies were collecting and using user data. Depending on how a developer plans on using user data, these guidelines may not bring about huge changes. However, app designers may want to review what data collection is allowed and how they can use that data. Over time, iOS security updates have trended towards giving users more control over their data, apps less control over the system and developers more APIs for adding specific functionality. Following that trend, iOS 12 is built with user security in mind. For developers, implementing security features will be easier than it has been in the past — and they can also feel more confident that the devices accessing their app are secure. Some of these changes make apps more secure for developers — like the addition of password auditing and better 2FA authentication. Others, like app sandboxing and the updates to the app store review guidelines, may require more planning from app developers than Apple has asked for in the past. To start building iOS 12 applications of your own with Xcode 10 and Swift 4.2, the building blocks of iOS development, read the book iOS 12 Programming for Beginners by Packt Publishing. Author Bio Kayla Matthews writes about big data, cybersecurity, and technology. You can find her work on The Week, Information Age, KDnuggets and CloudTweaks, or over at ProductivityBytes.com.
Read more
  • 0
  • 0
  • 26354

article-image-google-released-a-paper-showing-how-its-fighting-disinformation-on-its-platforms
Prasad Ramesh
26 Feb 2019
5 min read
Save for later

Google released a paper showing how it’s fighting disinformation on its platforms

Prasad Ramesh
26 Feb 2019
5 min read
Last Saturday, Google presented a paper in the Munich Security Conference titled How Google Fights Disinformation. In the paper, they explain what steps they’re taking against disinformation and detail their strategy for their platforms Google Search, News, YouTube, and Google Ads. We take a look at the key strategies that Google is taking against disinformation. Disinformation has become widespread in recent years. It directly affects Google’s mission of organizing the world’s information and making it accessible. Disinformation, misinformation, or fake new are deliberate attempts by acting parties to mislead people in believing things that aren’t true by spreading such content over the internet. Disinformation is deliberate attempts to mislead people where the creator knows that the information is false, misinformation is where the creator has their facts wrong and spreads wrong information unintentionally. The motivations behind it can be financial, political, or just for entertainment (trolls). Motivations can overlap with the content produced, moreover, the disinformation could also be for a good cause, making the fight against fake news very complex. A common solution for all platforms is not possible as different platforms pose different challenges. Making standards that exercise deep deliberation for individual cases is also not practical. There are three main principles that Google is outlining to combat disinformation, shown as follows. #1 Make quality content count Google products sort through a lot of information to display the most useful content first. They want to deliver quality content and legitimate commercial messages are prone to rumors. While the content is different on different Google platforms, the principles are similar: Organizing information by ranking algorithms. The algorithms are aimed to ensure that the information benefits users and is measured by user testing #2 Counter malicious actors Algorithms cannot determine if a piece of content is true or false based on current events. Neither can it determine the true intents of the content creator. For this, Google products have policies that prohibit certain behaviors like misinterpreting ownership of content. Certain users try to get a better ranking by practicing spam, such behavior is also shown by people who engage in spreading disinformation. Google has algorithms in place that can reduce such content and it’ll also be supported by human reviews for further filtering. #3 Giving users more choices Giving users different perspectives is important before they choose a link and proceed reading content or viewing a video. Hence, Google provides multiple links for a topic searched. Google search and other products now have additional UI elements to segregate information into different sections for an organized view of content. They also have a feedback button on their services via which users can submit their thoughts. Partnership with external experts Google cannot do this alone, hence they have partnered with supporting new organizations to create quality content that can uproot disinformation. They mention in the paper: “In March 2018, we launched the Google News Initiative (GNI) 3 to help journalism thrive in the digital age. With a $300 million commitment over 3 years, the initiative aims to elevate and strengthen quality journalism.” Preparing for the future People who create fake news will always try new methods to propagate it. Google is investing in research and development against it, now especially before the elections. They intend to stay ahead of the malicious actors who may use new technologies or tactics which can include deepfakes. They want to protect so that polling booths etc are easily available, guard against phishing, mitigate DDoS attacks on political websites. YouTube and conspiracy theories Recently, there have been a lot of conspiracy theories floating around on YouTube. In the paper, they say that: “YouTube has been developing products that directly address a core vulnerability involving the spread of disinformation in the immediate aftermath of a breaking news event.” Making a legitimate video with correct facts takes time, while disinformation can be created quickly for spreading panic/negativity etc,. In conclusion they, note that “fighting disinformation is not a straightforward endeavor. Disinformation and misinformation can take many shapes, manifest differently in different products, and raise significant challenges when it comes to balancing risks of harm to good faith, free expression, with the imperative to serve users with information they can trust.” Public reactions People think that only the platforms themselves can take actions against disinformation propaganda. https://twitter.com/halhod/status/1097640819102691328 Users question Google’s efforts in cases where the legitimate website is shown after the one with disinformation with an example of Bitcoin. https://twitter.com/PilotDaveCrypto/status/1097395466734653440 Some speculate that corporate companies should address their own bias of ranking pages first: https://twitter.com/PaulJayzilla/status/1097822412815646721 https://twitter.com/Darin_T80/status/1097203275483426816 To read the complete research paper with Google product-specific details on fighting disinformation, you can head on to the Google Blog. Fake news is a danger to democracy. These researchers are using deep learning to model fake news to understand its impact on elections. Defending Democracy Program: How Microsoft is taking steps to curb increasing cybersecurity threats to democracy Is Anti-trust regulation coming to Facebook following fake news inquiry made by a global panel in the House of Commons, UK?
Read more
  • 0
  • 0
  • 26350
article-image-github-universe-2019-github-for-mobile-github-archive-program-and-more-announced-amid-protests-against-githubs-ice-contract
Vincy Davis
14 Nov 2019
4 min read
Save for later

GitHub Universe 2019: GitHub for mobile, GitHub Archive Program and more announced amid protests against GitHub’s ICE contract

Vincy Davis
14 Nov 2019
4 min read
Yesterday, GitHub commenced its popular product conference GitHub Universe 2019 in San Francisco. The two-day annual conference celebrates the contribution of GitHub’s 40+ million developers and their contributions to the open source community. Day 1 of the conference had many interesting announcements like GitHub for mobile, GitHub Archive Program, and more. Let’s look at some of the major announcements at the GitHub Universe 2019 conference. GitHub for mobile iOS (beta) Github for mobile is a beta app that aims to give users the flexibility to work and interact with the team, anywhere they want. This will enable users to share feedback on a design discussion or review codes in a non-complex development environment. This native app will adapt to any screen size and will also work in dark mode based on the device preference. Currently available only on iOS, the GitHub team has said that it will soon come up with the Android version of it. https://twitter.com/italolelis/status/1194929030518255616 https://twitter.com/YashSharma___/status/1194899905552105472 GitHub Archive Program “Our world is powered by open source software. It’s a hidden cornerstone of our civilization and the shared heritage of all humanity. The mission of the GitHub Archive Program is to preserve it for generations to come,” states the official GitHub blog. GitHub has partnered with the Stanford Libraries, the Long Now Foundation, the Internet Archive, the Software Heritage Foundation, Piql, Microsoft Research, and the Bodleian Library to preserve all the available open source code in the world. It will safeguard all the data by storing multiple copies across various data formats and locations. This includes a “very-long-term archive” called the GitHub Arctic Code Vault which is designed to last at least 1,000 years. https://twitter.com/vithalreddy/status/1194846571835183104 https://twitter.com/sonicbw/status/1194680722856042499 Read More: GitHub Satellite 2019 focuses on community, security, and enterprise Automating workflows from code to cloud General availability of GitHub Actions Last year, at the GitHub Universe conference, GitHub Actions was announced in beta. This year, GitHub has made it generally available to all the users. In the past year, GitHub Actions has received contributions from the developers of AWS, Google, and others. Actions has now developed as a new standard for building and sharing automation for software development, including a CI/CD solution and native package management.GitHub has also announced the free use of self-hosted runners and artifact caching. https://twitter.com/qmixi/status/1194379789483704320 https://twitter.com/inversemetric/status/1194668430290345984 General availability of GitHub Packages In May this year, GitHub had announced the beta version of the GitHub Package Registry as its new package management service. Later in September, after gathering community feedback, GitHub announced that the service has proxy support for the primary npm registry. Since its launch, GitHub Package has received over 30,000 unique packages that served the needs of over 10,000 organizations. Now, at the GitHub Universe 2019, the GitHub team has announced the general availability of GitHub Packages and also informed that they have added support for using the GitHub Actions token. https://twitter.com/Chris_L_Ayers/status/1194693253532020736 These were some of the major announcements at day 1 of the GitHub Universe 2019 conference, head over to GitHub’s blog for more details of the event. Tech workers protests against GitHub’s ICE contract Major product announcements aside, one thing that garnered a lot of attention at the GitHub Universe conference was the protest conducted by the GitHub workers along with the Tech Workers Coalition to oppose GitHub’s $200,000 contract with Immigration and Customs Enforcement (ICE). Many high-profile speakers have dropped out of the GitHub Universe 2019 conference and at least five GitHub employees have resigned from GitHub due to its support for ICE. https://twitter.com/lily_dart/status/1194216293668401152 Read More: Largest ‘women in tech’ conference, Grace Hopper Celebration, renounces Palantir as a sponsor due to concerns over its work with the ICE Yesterday at the event, the protesting tech workers brought a giant cage to symbolize how ICE uses them to detain migrant children. https://twitter.com/githubbers/status/1194662876587233280 Tech workers around the world have extended their support to the protest against GitHub. https://twitter.com/ConMijente/status/1194665524191318016 https://twitter.com/CoralineAda/status/1194695061717450752 https://twitter.com/maybekatz/status/1194683980877975552 GitHub along with Weights & Biases introduced CodeSearchNet challenge evaluation and CodeSearchNet Corpus GitHub acquires Semmle to secure open-source supply chain; attains CVE Numbering Authority status GitHub Package Registry gets proxy support for the npm registry GitHub updates to Rails 6.0 with an incremental approach GitHub now supports two-factor authentication with security keys using the WebAuthn API
Read more
  • 0
  • 0
  • 26312

article-image-deploying-openstack-devops-way
Packt
07 Feb 2017
16 min read
Save for later

Deploying OpenStack – the DevOps Way

Packt
07 Feb 2017
16 min read
In this article by Chandan Dutta Chowdhury, Omar Khedher, authors of the book Mastering OpenStack - Second Edition, we will cover the Deploying an OpenStack environment based on the profiled design. Although we created our design by taking care of several aspects related to scalability and performance, we still have to make it real. If you are still looking at OpenStack as a single block system. (For more resources related to this topic, see here.) Furthermore, in the introductory section of this article, we covered the role of OpenStack in the next generation of data centers. A large-scale infrastructure used by cloud providers with a few thousand servers needs a very different approach to set up. In our case, deploying and operating the OpenStack cloud is not as simple as you might think. Thus, you need to make the operational task easier or, in other words, automated. In this article, we will cover new topics about the ways to deploy OpenStack. The next part will cover the following points: Learning what the DevOps movement is and how it can be adopted in the cloud Knowing how to see your infrastructure as code and how to maintain it Getting closer to the DevOps way by including configuration management aspects in your cloud Making your OpenStack environment design deployable via automation Starting your first OpenStack environment deployment using Ansible DevOps in a nutshell The term DevOps is a conjunction of development (software developers) and operations (managing and putting software into production). Many IT organizations have started to adopt such a concept, but the question is how and why? Is it a job? Is it a process or a practice? DevOps is development and operations compounded, which basically defines a methodology of software development. It describes practices that streamline the software delivery process. It is about raising communication and integration between developers, operators (including administrators), and quality assurance teams. The essence of the DevOps movement lies in leveraging the benefits of collaboration. Different disciplines can relate to DevOps in different ways and bring their experiences and skills together under the DevOps banner to gain shared values. So, DevOps is a methodology that integrates the efforts of several disciplines, as shown in the following figure: This new movement is intended to resolve the conflict between developers and operators. Delivering a new release affects the production systems. It puts different teams in conflicting positions by setting different goals for them, for example, the development team wants the their latest code to go live while the operations team wants more time to test and stage the changes before going to production. DevOps fills the gap and streamlines the process of bringing in change by bringing in collaboration between the developers and operators. DevOps is neither a toolkit nor a job; it is the synergy that streamlines the process of change. Let's see how DevOps can incubate a cloud project. DevOps and cloud – everything is code Let's look at the architecture of cloud computing. While discussing a cloud infrastructure, we must remember that we are talking about a large scalable environment! The amazing switch to bigger environments requires us to simplify everything as much as possible. System architecture and software design are becoming more and more complicated. Every new release of software affords new functions and new configurations. Administering and deploying a large infrastructure would not be possible without adopting a new philosophy: infrastructure as code. When infrastructure is seen as code, the components of a given infrastructure are modeled as modules of code. What you need to do is to abstract the functionality of the infrastructure into discrete reusable components, design the services provided by the infrastructure as modules of code, and finally implement them as blocks of automation. Furthermore, in such a paradigm, it will be essential to adhere to the same well-established discipline of software development as an infrastructure developer. The essence of DevOps mandates that developers, network engineers, and operators must work alongside each other to deploy, operate, and maintain cloud infrastructure which will power our next-generation data center. DevOps and OpenStack OpenStack is an open source project, and its code is extended, modified, and fixed in every release. It is composed of multiple projects and requires extensive skills to deploy and operate. Of course, it is not our mission to check the code and dive into its different modules and functions. So what can we do with DevOps, then? Deploying complex software on a large-scale infrastructure requires adopting new strategy. The ever-increasing complexity of software such as OpenStack and deployment of huge cloud infrastructure must be simplified. Everything in a given infrastructure must be automated! This is where OpenStack meets DevStack. Breaking down OpenStack into pieces Let's gather what we covered previously and signal a couple of steps towards our first OpenStack deployment: Break down the OpenStack infrastructure into independent and reusable services. Integrate the services in such a way that you can provide the expected functionalities in the OpenStack environment. It is obvious that OpenStack includes many services, What we need to do is see these services as packages of code in our infrastructure as code experience. The next step will investigate how to integrate the services and deploy them via automation. Deploying service as code is similar to writing a software application. Here are important points you should remember during the entire deployment process: Simplify and modularize the OpenStack services Develop OpenStack services as building blocks which integrate with other components to provide a complete system Facilitate the customization and improvement of services without impacting the complete system. Use the right tool to build the services Be sure that the services provide the same results with the same input Switch your service vision from how to do it to what we want to do Automation is the essence of DevOps. In fact, many system management tools are intensely used nowadays due to their efficiency of deployment. In other words, there is a need for automation! You have probably used some of the automation tools, such as Ansible, Chef, Puppet, and many more. Before we go through them, we need to create a succinct, professional code management step. Working with the infrastructure deployment code While dealing with infrastructure as code, the code that abstracts, models, and builds the OpenStack infrastructure must be committed to source code management. This is required for tracking changes in our automation code and reproducibility of results. Eventually, we must reach a point where we shift our OpenStack infrastructure from a code base to a deployed system while following the latest software development best practices. At this stage, you should be aware of the quality of your OpenStack infrastructure deployment, which roughly depends on the quality of the code that describes it. It is important to highlight a critical point that you should keep in mind during all deployment stages: automated systems are not able to understand human error. You'll have to go through an ensemble of phases and cycles using agile methodologies to end up with a release that is a largely bug-free to be promoted to the production environment. On the other hand, if mistakes cannot be totally eradicated, you should plan for the continuous development and testing of code. The code's life cycle management is shown in the following figure: Changes can be scary! To handle changes, it is recommended that you do the following: Keep track of and monitor the changes at every stage Build flexibility into the code and make it easy to change Refactor the code when it becomes difficult to manage Test, test, and retest your code Keep checking every point that has been described previously till you start to get more confident that your OpenStack infrastructure is being managed by code that won't break. Integrating OpenStack into infrastructure code To keep the OpenStack environment working with a minimum rate of surprises and ensure that the code delivers the functionalities that are required, we must continuously track the development of our infrastructure code. We will connect the OpenStack deployment code to a toolchain, where it will be constantly monitored and tested as we continue to develop and refine our code. This toolchain is composed of a pipeline of tracking, monitoring, testing, and reporting phases and is well known as a continuous integration and continuous development (CI-CD) process. Continuous integration and delivery Let's see how continuous integration (CI) can be applied to OpenStack. The life cycle of our automation code will be managed by the following categories of tools: System Management Tool Artifact (SMTA) can be any IT automation tool juju charms. Version Control System (VCS) tracks changes to our infrastructure deployment code. Any version control system, such as CVS, Subversion, or Bazaar, that you are most familiar with can be used for this purpose. Git can be a good outfit for our VCS. Jenkins is a perfect tool that monitors to changes in the VCS and does the continuous integration testing and reporting of results. Take a look at the model in the following figure: The proposed life-cycle for infrastructure as code consists of infrastructure configuration files that are recorded in a version control system and are built continuously by the means of a CI server (Jenkins, in our case). Infrastructure configuration files can be used to set up a unit test environment (a virtual environment using Vagrant, for example) and makes use of any system management tool to provision the infrastructure (Ansible, Chef, puppet, and so on). The CI server keeps listening to changes in version control and automatically propagates any new versions to be tested, and then it listens to target environments in production. Vagrant allows you to build a virtual environment very easily; it is based on Oracle VirtualBox (https://www.virtualbox.org/) to run virtual machines, so you will need these before moving on with the installation in your test environment. The proposed life-cycle for infrastructure code highlights the importance of a test environment before moving on to production. You should give a lot of importance to and care a good deal about the testing stage, although this might be a very time-consuming task. Especially in our case, with infrastructure code for deploying OpenStack that is complicated and has multiple dependencies on other systems, the importance of testing cannot be overemphasized. This makes it imperative to ensure effort is made for an automated and consistent testing of the infrastructure code. The best way to do this is to keep testing thoroughly in a repeated way till you gain confidence about your code. Choosing the automation tool At first sight, you may wonder what the best automation tool is that will be useful for our OpenStack production day. We have already chosen Git and Jenkins to handle our continuous integration and testing. It is time to choose the right tool for automation. It might be difficult to select the right tool. Most likely, you'll have to choose between several of them. Therefore, giving succinct hints on different tools might be helpful in order to distinguish the best outfit for certain particular setups. Of course, we are still talking about large infrastructures, a lot of networking, and distributed services. Giving the chance for one or more tools to be selected, as system management parties can be effective and fast for our deployment. We will use Ansible for the next deployment phase. Introducing Ansible We have chosen Ansible to automate our cloud infrastructure. Ansible is an infrastructure automation engine. It is simple to get started with and yet is flexible enough to handle complex interdependent systems. The architecture of Ansible consists of the deployment system where Ansible itself is installed and the target systems that are managed by Ansible. It uses an agentless architecture to push changes to the target systems. This is due to the use of SSH protocol as its transport mechanism to push changes to the target systems. This also means that there is no extra software installation required on the target system. The agentless architecture makes setting up Ansible very simple. Ansible works by copying modules over SSH to the target systems. It then executes them to change the state of the target systems. Once executed, the Ansible modules are cleaned up, leaving no trail on the target system. Although the default mechanism for making changes to the client system is an SSH-based push-model, if you feel the push-based model for delivering changes is not scalable enough for your infrastructure, Ansible also supports an agent-based pull-model. Ansible is developed in python and comes with a huge collection of core automation modules. The configuration files for Ansible are called Playbooks and they are written in YAML, which is just a markup language. YAML is easier to understand; it's custom-made for writing configuration files. This makes learning Ansible automation much easier. The Ansible Galaxy is a collection of reusable Ansible modules that can be used for your project. Modules Ansible modules are constructs that encapsulate a system resource or action. A module models the resource and its attributes. Ansible comes with packages with a wide range of core modules to represent various system resources; for example, the file module encapsulates a file on the system and has attributes such as owner, group, mode, and so on. These attributes represent the state of a file in the system; by changing the attributes of the resources, we can describe the required final state of the system. Variables While modules can represent the resources and actions on a system, the variables represent the dynamic part of the change. Variables can be used to modify the behavior of the modules. Variables can be defined from the environment of the host, for example, the hostname, IP address, version of software and hardware installed on a host, and so on. They can also be user-defined or provided as part of a module. User-defined variables can represent the classification of a host resource or its attribute. Inventory An inventory is a list of hosts that are managed by Ansible. The inventory list supports classifying hosts into groups. In its simplest form, an inventory can be an INI file. The groups are represented as article on the INI file. The classification can be based on the role of the hosts or any other system management need. It is possible to have a host to appear in multiple groups in an inventory file. The following example shows a simple inventory of hosts: logserver1.example.com [controllers] ctl1.example.com ctl2.example.com [computes] compute1.example.com compute2.example.com compute3.example.com compute[20:30].example.com The inventory file supports special patterns to represent large groups of hosts. Ansible expects to find the inventory file at /etc/ansible/hosts, but a custom location can be passed directly to the Ansible command line. Ansible also supports dynamic inventory that can be generated by executing scripts or retrieved from another management system, such as a cloud platform. Roles Roles are the building blocks of an Ansible-based deployment. They represent a collection of tasks that must be performed to configure a service on a group of hosts. The Role encapsulates tasks, variable, handlers, and other related functions required to deploy a service on a host. For example, to deploy a multinode web server cluster, the hosts in the infrastructure can be assigned roles such as web server, database server, load balancer, and so on. Playbooks Playbooks are the main configuration files in Ansible. They describe the complete system deployment plan. Playbooks are composed a series of tasks and are executed from top to bottom. The tasks themselves refer to group of hosts that must be deployed with roles. Ansible playbooks are written in YAML. The following is an example of a simple Ansible Playbook: --- - hosts: webservers vars: http_port: 8080 remote_user: root tasks: - name: ensure apache is at the latest version yum: name=httpd state=latest - name: write the apache config file template: src=/srv/httpd.j2 dest=/etc/httpd.conf notify: - restart apache handlers: - name: restart apache service: name=httpd state=restarted Ansible for OpenStack OpenStack Ansible (OSA) is an official OpenStack Big Tent project. It focuses on providing roles and playbooks for deploying a scalable, production-ready OpenStack setup. It has a very active community of developers and users collaborating to stabilize and bring new features to OpenStack deployment. One of the unique features of the OSA project is the use of containers to isolate and manage OpenStack services. OSA installs OpenStack services in LXC containers to provide each service with an isolated environment. LXC is an OS-level container and it encompasses a complete OS environment that includes a separate filesystem, networking stack, and resource isolation using cgroups. OpenStack services are spawned in separate LXC containers and speak to each other using the REST APIs. The microservice-based architecture of OpenStack complements the use of containers to isolate services. It also decouples the services from the physical hardware and provides encapsulation of the service environment that forms the foundation for providing portability, high availability, and redundancy. The OpenStack Ansible deployment is initiated from a deployment host. The deployment host is installed with Ansible and it runs the OSA playbooks to orchestrate the installation of OpenStack on the target hosts: The Ansible target hosts are the ones that will run the OpenStack services. The target nodes must be installed with Ubuntu 14.04 LTS and configured with SSH-key-based authentication to allow login from the deployment host. Summary In this article, we covered several topics and terminologies on how to develop and maintain a code infrastructure using the DevOps style. Viewing your OpenStack infrastructure deployment as code will not only simplify node configuration, but also improve the automation process. You should keep in mind that DevOps is neither a project nor a goal, but it is a methodology that will make your deployment successfully empowered by the team synergy with different departments. Despite the existence of numerous system management tools to bring our OpenStack up and running in an automated way, we have chosen Ansible for automation of our infrastructure. Puppet, Chef, Salt, and others can do the job but in different ways. You should know that there isn't one way to perform automation. Both Puppet and Chef have their own OpenStack deployment projects under the OpenStack Big Tent. Resources for Article: Further resources on this subject: Introduction to Ansible [article] OpenStack Networking in a Nutshell [article] Introducing OpenStack Trove [article]
Read more
  • 0
  • 0
  • 26289
Modal Close icon
Modal Close icon