How-To Tutorials

article-image-common-wlan-protection-mechanisms-and-their-flaws

07 Mar 2016

19 min read

Common WLAN Protection Mechanisms and their Flaws

07 Mar 2016

In this article by Vyacheslav Fadyushin, the author of the book Building a Pentesting Lab for Wireless Networks, we will discuss various WLAN protection mechanisms and the flaws present in them. To be able to protect a wireless network, it is crucial to clearly understand which protection mechanisms exist and which security flaws do they have. This topic will be useful not only for those readers who are new to Wi-Fi security, but also as a refresher for experienced security specialists. (For more resources related to this topic, see here.) Hiding SSID Let's start with one of the common mistakes done by network administrators: relying only on the security by obscurity. In the frames of the current subject, it means using a hidden WLAN SSID (short for service set identification) or simply WLAN name. Hidden SSID means that a WLAN does not send its SSID in broadcast beacons advertising itself and doesn't respond to broadcast probe requests, thus making itself unavailable in the list of networks on Wi-Fi-enabled devices. It also means that normal users do not see that WLAN in their available networks list. But the lack of WLAN advertising does not mean that a SSID is never transmitted in the air—it is actually transmitted in plaintext with a lot of packet between access points and devices connected to them regardless of a security type used. Therefore, SSIDs are always available for all the Wi-Fi network interfaces in a range and visible for any attacker using various passive sniffing tools. MAC filtering To be honest, MAC filtering cannot be even considered as a security or protection mechanism for a wireless network, but it is still called so in various sources. So let's clarify why we cannot call it a security feature. Basically, MAC filtering means allowing only those devices that have MAC addresses from a pre-defined list to connect to a WLAN, and not allowing connections from other devices. MAC addresses are transmitted unencrypted in Wi-Fi and are extremely easy for an attacker to intercept without even being noticed (refer to the following screenshot): An example of wireless traffic sniffing tool easily revealing MAC addresses Keeping in mind the extreme simplicity of changing a physical address (MAC address) of a network interface, it becomes obvious why MAC filtering should not be treated as a reliable security mechanism. MAC filtering can be used to support other security mechanisms, but it should not be used as an only security measure for a WLAN. WEP Wired equivalent privacy (WEP) was born almost 20 years ago at the same time as the Wi-Fi technology and was integrated as a security mechanism for the IEEE 802.11 standard. Like it often happens with new technologies, it soon became clear that WEP contains weaknesses by design and cannot provide reliable security for wireless networks. Several attack techniques were developed by security researchers that allowed them to crack a WEP key in a reasonable time and use it to connect to a WLAN or intercept network communications between WLAN and client devices. Let's briefly review how WEP encryption works and why is it so easy to break. WEP uses so-called initialization vectors (IV) concatenated with a WLAN's shared key to encrypt transmitted packets. After encrypting a network packet, an IV is added to a packet as it is and sent to a receiving side, for example, an access point. This process is depicted in the following flowchart: The WEP encryption process An attacker just needs to collect enough IVs, which is also a trivial task using additional reply attacks to force victims to generate more IVs. Even worse, there are attack techniques that allow an attacker to penetrate WEP-protected WLANs even without connected clients, which makes those WLANs vulnerable by default. Additionally, WEP does not have a cryptographic integrity control, which makes it vulnerable not only to attacks on confidentiality. There are numerous ways how an attacker can abuse a WEP-protected WLAN, for example: Decrypt network traffic using passive sniffing and statistical cryptanalysis Decrypt network traffic using active attacks (reply attack, for example) Traffic injection attacks Unauthorized WLAN access Although WEP was officially superseded by the WPA technology in 2003, it still can be sometimes found in private home networks and even in some corporate networks (mostly belonging to small companies nowadays). But this security technology has become very rare and will not be used anymore in future, largely due to awareness in corporate networks and because manufacturers no longer activate WEP by default on new devices. In our humble opinion, device manufacturers should not include WEP support in their new devices to avoid its usage and increase their customers' security. From the security specialist's point of view, WEP should never be used to protect a WLAN, but it can be used for Wi-Fi security training purposes. Regardless of a security type in use, shared keys always add the additional security risk; users often tend to share keys, thus increasing the risk of key compromise and reducing accountability for key privacy. Moreover, the more devices use the same key, the greater amount of traffic becomes suitable for an attacker during cryptanalytic attacks, increasing their performance and chances of success. This risk can be minimized by using personal identifiers (key, certificate) for users and devices. WPA/WPA2 Due to numerous WEP security flaws, the next generation of Wi-Fi security mechanism became available in 2003: Wi-Fi Protected Access (WPA). It was announced as an intermediate solution until WPA2 is available and contained significant security improvements over WEP. Those improvements include: Stronger encryption Cryptographic integrity control: WPA uses an algorithm called Michael instead of CRC used in WEP. This is supposed to prevent altering data packets on the fly and protects from resending sniffed packets. Usage of temporary keys: The Temporal Key Integrity Protocol (TKIP) automatically changes encryption keys generated for every packet. This is the major improvement over the static WEP where encryption keys should be entered manually in an AP config. TKIP also operates RC4, but the way how it is used was improved. Support of client authentication: The capability to use dedicated authentication servers for user and device authentication made WPA suitable for use in large enterprise networks. The support of cryptographically strong algorithm Advanced Encryption Standard (AES) was implemented in WPA, but it was not set as mandatory, only optional. Although WPA was a significant improvement over WEP, it was a temporary solution before WPA2 was released in 2004 and became mandatory for all new Wi-Fi devices. WPA2 works very similar to WPA and the main differences between WPA and WPA2 is in the algorithms used to provide security: AES became the mandatory algorithm for encryption in WPA2 instead of default RC4 in WPA TKIP used in WPA was replaced by Counter Cipher Mode with Block Chaining Message Authentication Code Protocol (CCMP) Because of the very similar workflows, WPA and WPA2 are also vulnerable to the similar or the same attacks and usually called and spelled in one word WPA/WPA2. Both WPA and WPA2 can work in two modes: pre-shared key (PSK) or personal mode and enterprise mode. Pre-shared key mode Pre-shared key or personal mode was intended for home and small office use where networks have low complexity. We are more than sure that all our readers have met this mode and that most of you use it at home to connect your laptops, mobile phones, tablets, and so on to home networks. The general idea of PSK mode is using the same secret key on an access point and on a client device to authenticate the device and establish an encrypted connection for networking. The process of WPA/WPA2 authentication using a PSK consists of four phases and is also called a 4-way handshake. It is depicted in the following diagram: WPA/WPA2 4-way handshake The main WPA/WPA2 flaw in PSK mode is the possibility to sniff a whole 4-way handshake and brute-force a security key offline without any interaction with a target WLAN. Generally, the security of a WLAN mostly depends on the complexity of a chosen PSK. Computing a PMK (short for primary master key) used in 4-way handshakes (refer to the handshake diagram) is a very time-consuming process compared to other computing operations and computing hundreds of thousands of them can take very long. But in case of a short and low complexity PSK in use, a brute-force attack does not take long even on a not-so-powerful computer. If a key is complex and long enough, cracking it can take much longer, but still there are ways to speed up this process: Using powerful computers with CUDA (short for Compute Unified Device Architecture), which allows a software to directly communicate with GPUs for computing. As GPUs are natively designed to perform mathematical operations and do it much faster than CPUs, the process of cracking works several times faster with CUDA. Using rainbow tables that contain pairs of various PSKs and their corresponding precomputed hashes. They save a lot of time for an attacker because the cracking software just searches for a value from an intercepted 4-way handshake in rainbow tables and returns a key corresponding to the given PMK if there was a match instead of computing PMKs for every possible character combination. Because WLAN SSIDs are used in 4-way handshakes analogous to a cryptographic salt, PMKs for the same key will differ for different SSIDs. This limits the application of rainbow tables to a number of the most popular SSIDs. Using cloud computing is another way to speed up the cracking process, but it usually costs additional money. The more computing power can an attacker rent (or get through another ways), the faster goes the process. There are also online cloud-cracking services available in Internet for various cracking purposes including cracking 4-way handshakes. Furthermore, as with WEP, the more users know a WPA/WPA2 PSK, the greater the risk of compromise—that's why it is also not an option for big complex corporate networks. WPA/WPA2 PSK mode provides the sufficient level of security for home and small office networks only when a key is long and complex enough and is used with a unique (or at least not popular) WLAN SSID. Enterprise mode As it was already mentioned in the previous section, using shared keys itself poses a security risk and in case of WPA/WPA2 highly relies on a key length and complexity. But there are several factors in enterprise networks that should be taken into account when talking about WLAN infrastructure: flexibility, manageability, and accountability. There are various components that implement those functions in big networks, but in the context of our topic, we are mostly interested in two of them: AAA (short for authentication, authorization, and accounting) servers and wireless controllers. WPA-Enterprise or 802.1x mode was designed for enterprise networks where a high security level is needed and the use of AAA server is required. In most cases, a RADIUS server is used as an AAA server and the following EAP (Extensible Authentication Protocol) types are supported (and several more, depending on a wireless device) with WPA/WPA2 to perform authentication: EAP-TLS EAP-TTLS/MSCHAPv2 PEAPv0/EAP-MSCHAPv2 PEAPv1/EAP-GTC PEAP-TLS EAP-FAST You can find a simplified WPA-Enterprise authentication workflow in the following diagram: WPA-Enterprise authentication Depending on an EAP-type configured, WPA-Enterprise can provide various authentication options. The most popular EAP-type (based on our own experience in numerous pentests) is PEAPv0/MSCHAPv2, which is relatively easily integrated with existing Microsoft Active Directory infrastructures and is relatively easy to manage. But this type of WPA-protection is relatively easy to defeat by stealing and bruteforcing user credentials with a rogue access point. The most secure EAP-type (at least, when configured and managed correctly) is EAP-TLS, which employs certificate-based authentication for both users and authentication servers. During this type of authentication, clients also check server's identity and a successful attack with a rogue access point becomes possible only if there are errors in configuration or insecurities in certificates maintenance and distribution. It is recommended to protect enterprise WLANs with WPA-Enterprise in EAP-TLS mode with mutual client and server certificate-based authentication. But this type of security requires additional work and resources. WPS Wi-Fi Protected Setup (WPS) is actually not a security mechanism, but a key exchange mechanism which plays an important role in establishing connections between devices and access points. It was developed to make the process of connecting a device to an access point easier, but it turned out to be one of the biggest wholes in modern WLANs if activated. WPS works with WPA/WPA2-PSK and allows connecting devices to WLANs with one of the following methods: PIN: Entering a PIN on a device. A PIN is usually printed on a sticker at the back side of a Wi-Fi access point. Push button: Special buttons should be pushed on both an access point and a client device during the connection phase. Buttons on devices can be physical and virtual. NFC: A client should bring a device close to an access point to utilize the Near Field Communication technology. USB drive: Necessary connection information exchange between an access point and a device is done using a USB drive. Because WPS PINs are very short and their first and second parts are validated separately, online brute-force attack on a PIN can be done in several hours allowing an attacker to connect to a WLAN. Furthermore, the possibility of offline PIN cracking was found in 2014 which allows attackers to crack pins in 1 to 30 seconds, but it works only on certain devices. You should also not forget that a person who is not permitted to connect to a WLAN but who can physically access a Wi-Fi router or access point can also read and use a PIN or connect via push button method. Getting familiar with the Wi-Fi attack workflow In our opinion (and we hope you agree with us), planning and building a secure WLAN is not possible without the understanding of various attack methods and their workflow. In this topic, we will give you an overview of how attackers work when they are hacking WLANs. General Wi-Fi attack methodology After refreshing our knowledge about wireless threats and Wi-Fi security mechanisms, let's have a look at the attack methodology used by attackers in the real world. Of course, as with all other types of network attack, wireless attack workflows depend on certain situations and targets, but they still align with the following general sequence in almost all cases: The first step is planning. Normally, attackers need to plan what are they going to attack, how can they do it, which tools are necessary for the task, when is the best time and place to attack certain targets, and which configuration templates will be useful so that they can be prepared in advance. White-hat hackers or penetration testers need to set schedules and coordinate project plans with their customers, choose contact persons on a customer side, define project deliverables, and do some other organizational work if demanded. As with every penetration testing project, the better a project was planned (and we can use the word "project" for black-hat hackers' tasks), the more chances of a successful result. The next step is survey. Getting as accurate as possible and as much as possible information about a target is crucial for a successful hack, especially in uncommon network infrastructures. To hack a WLAN or its wireless clients, an attacker would normally collect at least SSIDs or MAC addresses of access points and clients and information about a security type in use. It is also very helpful for an attacker to understand if WPS is enabled on a target access point. All that data allows attackers not only to set proper configs and choose the right options for their tools, but also to choose appropriate attack types and conditions for a certain WLAN or Wi-Fi client. All collected information, especially non-technical (for example, company and department names, brands, or employee names), can also become useful at the cracking phase to build dictionaries for brute-force attacks. Depending on a type of security and attacker's luck, a data collected at the survey phase can even make the active attack phase unnecessary and allow an attacker to proceed directly with the cracking phase. The active attacks phase involves active interaction between an attacker and targets (WLANs and Wi-Fi clients). At this phase, attackers have to create conditions necessary for a chosen attack type and execute it. It includes sending various Wi-Fi management and control frames and installing rogue access points. If an attacker wants to cause a denial of service in a target WLAN as a goal, such attacks are also executed at this phase. Some of active attacks are essential to successfully hack a WLAN, but some of them are intended to just speed up hacking and can be omitted to avoid causing alarm on various wireless intrusion detection/prevention systems (wIDPS), which can possibly be installed in a target network. Thus, the active attacks phase can be called optional. Cracking is another important phase where an attacker cracks 4-way-hadshakes, WEP data, NTLM hashes, and so on, which were intercepted at the previous phases. There are plenty of various free and commercial tools and services including cloud cracking services. In case of success at this phase, an attacker gets the target WLAN's secret(s) and can proceed with connecting to the WLAN, decrypt intercepted traffic, and so on. The active attacking phase Let's have a closer look to the most interesting parts of the active attack phase—WPA-PSK and WPA-Enterprise attacks—in the following topics. WPA-PSK attacks As both WPA and WPA2 are based on the 4-way handshake, attacking them doesn't differ—an attacker needs to sniff a 4-way handshake in the moment, establishing a connection between an access point and an arbitrary wireless client and bruteforcing a matching PSK. It does not matter whose handshake is intercepted, because all clients use the same PSK for a given target WLAN. Sometimes, attackers have to wait long until a device connects to a WLAN to intercept a 4-way handshake and of course they would like to speed up the process when possible. For that purpose, they force already connected device to disconnect from an access point sending control frames (deauthentication attack) on behalf of a target access point. When a device receives such a frame, it disconnects from a WLAN and tries to reconnect again if the "automatic reconnect" feature is enabled (it is enabled by default on most devices), thus performing another one 4-way handshake that can be intercepted by an attacker. Another possibility to hack a WPA-PSK protected network is to crack a WPS PIN if WPS is enabled on a target WLAN. Enterprise WLAN attacks Attacking becomes a little bit more complicated if WPA-Enterprise security is in place, but could be executed in several minutes by a properly prepared attacker by imitating a legitimate access point with a RADIUS server and by gathering user credentials for further analysis (cracking). To settle this attack, an attacker needs to install a rogue access point with a SSID identical to a target WLAN's SSID and set other parameter (like EAP type) similar to the target WLAN to increase chances on success and reduce the probability of the attack to be quickly detected. Most user Wi-Fi devices choose an access point for a connection to a certain WLAN by a signal strength—they connect to that one which has the strongest signal. That is why an attacker needs to use a powerful Wi-Fi interface for a rogue access point to override signals from legitimate ones and make devices around connect to the rogue access point. A RADIUS server used during such attacks should have a capability to record authentication data, NTLM hashes, for example. From a user perspective, being attacked in such way looks just being unable to connect to a WLAN for an unknown reason and could even be not seen if a user is not using a device at the moment and just passing by a rogue access point. It is worth mentioning that classic physical security or wireless IDPS solutions are not always effective in such cases. An attacker or a penetration tester can install a rogue access point outside of the range of a target WLAN. It will allow hacker to attack user devices without the need to get into a physically controlled area (for example, office building), thus making the rogue access point unreachable and invisible for wireless IDPS systems. Such a place could be a bus or train station, parking or a café where a lot of users of a target WLAN go with their Wi-Fi devices. Unlike WPA-PSK with only one key shared between all WLAN users, the Enterprise mode employs personified credentials for each user whose credentials could be more or less complex depending only on a certain user. That is why it is better to collect as many user credentials and hashes as possible thus increasing the chances of successful cracking. Summary In this article, we looked at the security mechanisms that are used to secure access to wireless networks, their typical threads, and common misconfigurations that lead to security breaches and allow attacker to harm corporate and private wireless networks. The brief attack methodology overview has given us a general understanding of how attackers normally act during wireless attacks and how they bypass common security mechanisms by abusing certain flaws in those mechanisms. We also saw that the most secure and preferable way to protect a wireless network is to use WPA2-Enterprise security along with a mutual client and server authentication, which we are going to implement in our penetration testing lab. Resources for Article: Further resources on this subject: Pentesting Using Python [article] Advanced Wireless Sniffing [article] Wireless and Mobile Hacks [article]

0
0
9789

How-To Tutorials

Packt

31 Jul 2013

6 min read

Quick start – creating your first grid

Packt

31 Jul 2013

6 min read

(For more resources related to this topic, see here.) So, let's do that right now for this example. Create a copy of the entire jqGrid folder named quickstart; easy as pie. Let's begin by creating the simplest grid possible; open up the index.html file from the newly created quickstart folder, and modify the body section to look like the following code: <body><table id="quickstart_grid"></table><script>var dataArray = [{name: 'Bob', phone: '232-532-6268'},{name: 'Jeff', phone: '365-267-8325'}];$("#quickstart_grid").jqGrid({datatype: 'local',data: dataArray,colModel: [{name: 'name', label: 'Name'},{name: 'phone', label: 'Phone Number'}]});</script></body> The first element in the body tags is a standard table element; this is what will be converted into our grid, and it's literally all the HTML needed to make one. The first four lines are just defining some data. I didn't want to get into AJAX or opening other files, so I decided to just create a simple JavaScript array with two entries. The next line is where the magic happens; this single command is being used to create and populate a grid using the information provided. We select the table element with jQuery, and call the jqGrid function, passing it all the properties needed to make a grid. The first two options set the data along with its type, in our case the data is the array we made and the type is local, which is in contrast to some of the other datatypes which use AJAX to retrieve remote data. After that we set the columns, this is done with the colModel property. Now, there are a lot of options for the colModel property, which we will get to later on, numerous settings for customizing and manipulating the data in the table. But for this simple example, we are just specifying the name and label properties, which tell jqGrid the column's label for the header and the value's key from the data array. Now, open index.html in your browser and you should see something like the following screenshot: Not particularly pretty, but you can see that with just a few short lines, we have created a grid and populated it with data that can be sorted. But we can do better; first off, we are only using two of the four standard layers we talked about: the header layer and the body layer. Let's add a caption layer to provide little context, and let's adjust the size of the grid to fit our data. So, modify the call to jqGrid with the following: $("#grid").jqGrid({datatype: 'local',data: dataArray,colModel: [{name: 'name', label: 'Name'},{name: 'phone', label: 'Phone Number'}],height: 'auto',caption: 'Users Grid',}); And refresh your browser; your grid should now look like the following screenshot: That's looking much better. Now, you may be wondering, we only set the height property to auto, so how come the width seems to have snapped to the content? This is due to the fact that the right margin we saw earlier is actually a column for the scroll bar. By default jqGrid sets your grid's height to 150 pixels, this means, regardless of whether you have only one row, or a thousand rows, the height will remain the same, so that there is a gap to hold the scroll bar in an event when you have more rows than that would fit in the given space. When we set the height to auto, it will stretch the grid vertically to contain all the items, making the scroll bar irrelevant and therefore it knows not to place it. Now this is a pretty good quick start example, but to finish things off, let's take a look at the navigation layer, just so we can say we did. For this next part, although we are going to need more data, I can't really show pagination with just two entries, luckily there is a site http://www.json-generator.com/ created by Vazha Omanashvili for doing exactly this. The way it works is, you specify the format and number of rows you want, and it generates it with random data. We are going to keep the format we have been using, of name and phone number, so in the box on the left, enter the following code: ['{{repeat(50)}}',{name: '{{firstName}} {{lastName}}',phone: '{{phone}}'}] Here we're saying we would like 50 rows with a name field containing both firstname and lastname, and we would also like a phone number. This site is not really in the scope of this book, but if you would like to know about other fields, you can click on the help button at the top. Click on Generate to produce a result object, and then just click on the copy to clipboard button. With that done, open your index.html file and replace the array for dataArray with what you copied. Your code should now look like the following code: var dataArray = {"id": 1,"jsonrpc": "2.0","total": 50,"result": [ /* the 50 rows */ ]}; As you can see, the actual rows are under a property named result, so we will need to change the data key in the call to jqGrid from dataArray to dataArray.result. Refreshing the page now you will see the first 20 rows being displayed (that is the default limit). But how can we get to the rest? Well, jqGrid provides a special navigation layer named "pager", which contains a pagination control. To display it, we will need to create an HTML element for it. So, right underneath the table element add a div like: <table id="grid"></table><div id="pager"></div> Then, we just need to add a key to the jqGrid method for the pager and row limit: caption: 'Users Grid',height: 'auto',rowNum: 10,pager: '#pager'}); And that's all there to it, you can adjust the rowNum property to display more or less entries at once, and the pages will be automatically calculated for you. Summary In this article we learned how to create a simple Grid and customize it. We used different properties and functions for the same. We also used different layers to customize the jqGrid. Resources for Article : Further resources on this subject: jQuery and Ajax Integration in Django [Article] Implementing AJAX Grid using jQuery data grid plugin jqGrid [Article] Django JavaScript Integration: jQuery In-place Editing Using Ajax [Article]

0
0
9785

How-To Tutorials

article-image-installing-openvpn-linux-and-unix-systems-part-1

Packt

31 Dec 2009

10 min read

Installing OpenVPN on Linux and Unix Systems: Part 1

Packt

31 Dec 2009

10 min read

0
0
9773

Packt

27 Apr 2015

9 min read

Apache Solr and Big Data – integration with MongoDB

Packt

27 Apr 2015

9 min read

In this article by Hrishikesh Vijay Karambelkar, author of the book Scaling Big Data with Hadoop and Solr - Second Edition, we will go through Apache Solr and MongoDB together. In an enterprise, data is generated from all the software that is participating in day-to-day operations. This data has different formats, and bringing in this data for big-data processing requires a storage system that is flexible enough to accommodate a data with varying data models. A NoSQL database, by its design, is best suited for this kind of storage requirements. One of the primary objectives of NoSQL is horizontal scaling, that is, the P in CAP theorem, but this works at the cost of sacrificing Consistency or Availability. Visit http://en.wikipedia.org/wiki/CAP_theorem to understand more about CAP theorem (For more resources related to this topic, see here.) What is NoSQL and how is it related to Big Data? As we have seen, data models for NoSQL differ completely from that of a relational database. With the flexible data model, it becomes very easy for developers to quickly integrate with the NoSQL database, and bring in large sized data from different data sources. This makes the NoSQL database ideal for Big Data storage, since it demands that different data types be brought together under one umbrella. NoSQL also has different data models, like KV store, document store and Big Table storage. In addition to flexible schema, NoSQL offers scalability and high performance, which is again one of the most important factors to be considered while running big data. NoSQL was developed to be a distributed type of database. When traditional relational stores rely on the high computing power of CPUs and the high memory focused on a centralized system, NoSQL can run on your low-cost, commodity hardware. These servers can be added or removed dynamically from the cluster running NoSQL, making the NoSQL database easier to scale. NoSQL enables most advanced features of a database, like data partitioning, index sharding, distributed query, caching, and so on. Although NoSQL offers optimized storage for big data, it may not be able to replace the relational database. While relational databases offer transactional (ACID), high CRUD, data integrity, and a structured database design approach, which are required in many applications, NoSQL may not support them. Hence, it is most suited for Big Data where there is less possibility of need for data to be transactional. MongoDB at glance MongoDB is one of the popular NoSQL databases, (just like Cassandra). MongoDB supports the storing of any random schemas in the document oriented storage of its own. MongoDB supports the JSON-based information pipe for any communication with the server. This database is designed to work with heavy data. Today, many organizations are focusing on utilizing MongoDB for various enterprise applications. MongoDB provides high availability and load balancing. Each data unit is replicated and the combination of a data with its copes is called a replica set. Replicas in MongoDB can either be primary or secondary. Primary is the active replica, which is used for direct read-write operations, while the secondary replica works like a backup for the primary. MongoDB supports searches by field, range queries, and regular expression searches. Queries can return specific fields of documents and also include user-defined JavaScript functions. Any field in a MongoDB document can be indexed. More information about MongoDB can be read at https://www.mongodb.org/. The data on MongoDB is eventually consistent. Apache Solr can be used to work with MongoDB, to enable database searching capabilities on a MongoDB-based data store. Unlike Cassandra, where the Solr indexes are stored directly in Cassandra through solandra, MongoDB integration with Solr brings in the indexes in the Solr-based optimized storage. There are various ways in which the data residing in MongoDB can be analyzed and searched. MongoDB's replication works by recording all operations made on a database in a log file, called the oplog (operation log). Mongo's oplog keeps a rolling record of all operations that modify the data stored in your databases. Many of the implementers suggest reading this log file using a standard file IO program to push the data directly to Apache Solr, using CURL, SolrJ. Since oplog is a collection of data with an upper limit on maximum storage, it is feasible to synch such querying with Apache Solr. Oplog also provides tailable cursors on the database. These cursors can provide a natural order to the documents loaded in MongoDB, thereby, preserving their order. However, we are going to look at a different approach. Let's look at the schematic following diagram: In this case, MongoDB is exposed as a database to Apache Solr through the custom database driver. Apache Solr reads MongoDB data through the DataImportHandler, which in turns calls the JDBC-based MongoDB driver for connecting to MongoDB and running data import utilities. Since MongoDB supports replica sets, it manages the distribution of data across nodes. It also supports Sharding just like Apache Solr. Installing MongoDB To install MongoDB in your development environment, please follow the following steps: Download the latest version of MongoDB from https://www.mongodb.org/downloads for your supported operating system. Unzip the zipped folder. MongoDB comes up with a default set of different command-line components and utilities: bin/mongod: The database process. bin/mongos: Sharding controller. bin/mongo: The database shell (uses interactive JavaScript). Now, create a directory for MongoDB, which it will use for user data creation and management, and run the following command to start the single node server: $ bin/mongod –dbpath <path to your data directory> --rest In this case, --rest parameter enables support for simple rest APIs that can be used for getting the status. Once the server is started, access http://localhost:28017 from your favorite browser, you should be able to see following administration status page: Now that you have successfully installed MongoDB, try loading a sample data set from the book on MongoDB by opening a new command-line interface. Change the directory to $MONGODB_HOME and run the following command: $ bin/mongoimport --db solr-test --collection zips --file "<file-dir>/samples/zips.json" Please note that the database name is solr-test. You can see the stored data using the MongoDB-based CLI by running the following set of commands from your shell: $ bin/mongo MongoDB shell version: 2.4.9 connecting to: test Welcome to the MongoDB shell. For interactive help, type "help". For more comprehensive documentation, see http://docs.mongodb.org/ Questions? Try the support group http://groups.google.com/group/mongodb-user > use test Switched to db test > show dbs exampledb 0.203125GB local 0.078125GB test 0.203125GB > db.zips.find({city:"ACMAR"}) { "city" : "ACMAR", "loc" : [ -86.51557, 33.584132 ], "pop" : 6055, "state" :"AL", "_id" : "35004" } Congratulations! MongoDB is installed successfully Creating Solr indexes from MongoDB To run MongoDB as a database, you will need a JDBC driver built for MongoDB. However, the Mongo-JDBC driver has certain limitations, and it does not work with the Apache Solr DataImportHandler. So, I have extended Mongo-JDBC to work under the Solr-based DataImportHandler. The project repository is available at https://github.com/hrishik/solr-mongodb-dih. Let's look at the setting-up procedure for enabling MongoDB based Solr integration: You may not require a complete package from the solr-mongodb-dih repository, but just the jar file. This can be downloaded from https://github.com/hrishik/solr-mongodb-dih/tree/master/sample-jar. You will also need the following additional jar files: jsqlparser.jar mongo.jar These jars are available with the book Scaling Big Data with Hadoop and Solr, Second Edition for download. In your Solr setup, copy these jar files into the library path, that is, the $SOLR_WAR_LOCATION/WEB-INF/lib folder. Alternatively, point your container classpath variable to link them up. Using simple Java source code DataLoad.java (link https://github.com/hrishik/solr-mongodb-dih/blob/master/examples/DataLoad.java), populate the database with some sample schema and tables that you will use to load in Apache Solr. Now create a data source file (data-source-config.xml) as follows: <dataConfig> <dataSource name="mongod" type="JdbcDataSource" driver="com.mongodb. jdbc.MongoDriver" url="mongodb://localhost/solr-test"/> <document> <entity name="nameage" dataSource="mongod" query="select name, price from grocery"> <field column="name" name="name"/> <field column="name" name="id"/>  </entity> </document> </dataConfig> Copy the solr-dataimporthandler-*.jar from your contrib directory to a container/application library path. Modify $SOLR_COLLECTION_ROOT/conf/solr-config.xml with DIH entry:  <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler"> <lst name="defaults"> <str name="config"><path to config>/data-source-config.xml</str> </lst> </requestHandler>  Once this configuration is done, you are ready to test it out. Access http://localhost:8983/solr/dataimport?command=full-import from your browser to run the full import on Apache Solr, where you will see that your import handler has successfully ran, and has loaded the data in Solr store, as shown in the following screenshot: You can validate the content created by your new MongoDB DIH by accessing the Solr Admin page, and running a query: Using this connector, you can perform operations for full-import on various data elements. Since MongoDB is not a relational database, it does support join queries. However, it supports selects, order by, and so on. Summary In this article, we have understood the distributed aspects of any enterprise search where went through Apache Solr and MongoDB together. Resources for Article: Further resources on this subject: Evolution of Hadoop [article] In the Cloud [article] Learning Data Analytics with R and Hadoop [article]

0
0
9767

article-image-analyzing-moby-dick-frequency-distribution-nltk

Richard Gall

30 Mar 2018

2 min read

Analyzing Moby Dick through frequency distribution with NLTK

Richard Gall

30 Mar 2018

2 min read

What is frequency distribution and why does it matter? In the context of natural language processing, frequency distribution is simply a tally of the number of times each unique word is used in a text. Recording the individual word counts of a text can better help us understand not only what topics are being discussed and what information is important but also how that information is being discussed as well. It's a useful method for better understanding language and different types of texts. This video tutorial has been taken from from Natural Language Processing with Python. Word frequency distribution is central to performing content analysis with NLP. Its applications are wide ranging. From understanding and characterizing an author’s writing style to analyzing the vocabulary of rappers, the technique is playing a large part in wider cultural conversations. It’s also used in psychological research in a number of ways to analyze how patients use language to form frameworks for thinking about themselves and the world. Trivial or serious, word frequency distribution is becoming more and more important in the world of research. Of course, manually creating such a word frequency distribution models would be time consuming and inconvenient for data scientists. Fortunately for us, NLTK, Python’s toolkit for natural language processing, makes life much easier. How to use NLTK for frequency distribution Take a look at how to use NLTK to create a frequency distribution for Herman Melville’s Moby Dick in the video tutorial above. In it, you'll find a step by step guide to performing an important data analysis task. Once you've done that, you can try it for yourself, or have a go at performing a similar analysis on another data set. Read Analyzing Textual information using the NLTK library. Learn more about natural language processing - read How to create a conversational assistant or chatbot using Python.

0
0
9766

article-image-mastering-centos-7-linux-server-0

Packt

30 Dec 2015

19 min read

Mastering CentOS 7 Linux Server

Packt

30 Dec 2015

19 min read

0
0
9745

article-image-chatgpt-for-natural-language-processing-nlp

Bhavishya Pandit

25 Sep 2023

10 min read

ChatGPT for Natural Language Processing (NLP)

Bhavishya Pandit

25 Sep 2023

10 min read

Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights and books. Don't miss out – sign up today!IntroductionIn an era defined by the fusion of technology and human interaction, ChatGPT stands at the forefront as a groundbreaking creation. This marvel of machine learning, developed by OpenAI, has transcended mere algorithms to become a conversational AI that possesses the ability to engage, assist, and inspire. As a professional writer deeply immersed in both the realms of language and artificial intelligence, I am excited to delve into the capabilities of ChatGPT and explore its potential impact on a world increasingly reliant on Natural Language Processing (NLP). In this article, we will not only unveil the astonishing abilities of ChatGPT but also shed light on the burgeoning significance of NLP across diverse industries.Accessing GPT APIThe ChatGPT API provides a streamlined way to integrate the power of ChatGPT into applications and services. It operates through a simple yet effective mechanism: users send a list of messages as input, with each message having a 'role' (system, user, or assistant) and 'content' (the text of the message). The conversation typically begins with a system message to set the AI's behavior, followed by alternating user and assistant messages.The API returns a model-generated message as output, which can be easily extracted from the API response. To access this functionality, developers can obtain API keys through the OpenAI platform. These keys grant access to the API, enabling developers to harness the capabilities of ChatGPT within their applications and projects seamlessly.ChatGPT for various NLP tasks1. Sentiment Analysis with ChatGPTUsing ChatGPT for sentiment analysis is a straightforward yet powerful application. To perform sentiment analysis, you can send a message to ChatGPT with user or assistant roles and ask it to determine the sentiment of a piece of text. Here's an example in Python using the OpenAI Python library:import openai openai.api_key = "YOUR_API_KEY" def analyze_sentiment(text): response = openai.ChatCompletion.create( model="gpt-3.5-turbo", messages=[ {"role": "user", "content": f"Analyze the sentiment of the following text: '{text}'"} ] ) sentiment = response['choices'][0]['message']['content'] return sentiment text_to_analyze = "I absolutely love this product!" sentiment_result = analyze_sentiment(text_to_analyze) print(f"Sentiment: {sentiment_result}") Potential Applications:1. Social Media Monitoring: ChatGPT's sentiment analysis can be invaluable for businesses and brands aiming to track public sentiment about their products or services on social media platforms. By analyzing user-generated content, companies can gain real-time insights into how their brand is perceived and promptly respond to both positive and negative feedback.2. Customer Feedback Analysis: ChatGPT can assist in automating the process of analyzing customer reviews and feedback. It can categorize comments as positive, negative, or neutral, helping businesses identify areas for improvement and understand customer sentiment more comprehensively.3. Market Research: Researchers can leverage ChatGPT's sentiment analysis capabilities to process large volumes of text data from surveys, focus groups, or online forums. This aids in identifying emerging trends, gauging public opinion, and making data-driven decisions.By integrating ChatGPT's sentiment analysis into these and other applications, organizations can harness the power of natural language understanding to gain deeper insights into the opinions, emotions, and attitudes of their audience, leading to more informed and effective decision-making.2. Language Translation with ChatGPTChatGPT can be harnessed for language translation tasks with ease. It's a versatile tool for converting text from one language to another. Here's a Python code example demonstrating how to use ChatGPT for language translation:import openai openai.api_key = "YOUR_API_KEY" def translate_text(text, source_language, target_language): response = openai.ChatCompletion.create( model="gpt-3.5-turbo", messages=[ {"role": "user", "content": f"Translate the following text from {source_language} to {target_language}: '{text}'"} ] ) translation = response['choices'][0]['message']['content'] return translation source_text = "Hello, how are you?" source_language = "English" target_language = "French" translated_text = translate_text(source_text, source_language, target_language) print(f"Translated Text: {translated_text}") Relevance in Multilingual Content Creation and Internationalization:1. Multilingual Content Creation: In an increasingly globalized world, businesses and content creators need to reach diverse audiences. ChatGPT's language translation capabilities facilitate the creation of multilingual content, enabling companies to expand their market reach and engage with customers in their native languages. This is crucial for marketing campaigns, websites, and product documentation.2. Internationalization: For software and apps aiming to go international, ChatGPT can assist in translating user interfaces and content into multiple languages. This enhances the user experience and makes products more accessible to a global user base.3. Cross-Cultural Communication: ChatGPT can help bridge language barriers in real-time conversations, facilitating cross-cultural communication. This is beneficial in customer support, online chat, and international business negotiations.By leveraging ChatGPT's language translation capabilities, organizations and individuals can enhance their global presence, foster better communication across languages, and tailor their content to a diverse and international audience. This, in turn, can lead to increased engagement, improved user satisfaction, and broader market opportunities.3. Text Summarization with ChatGPTChatGPT can be a valuable tool for generating concise and coherent text summaries from lengthy articles or documents. It leverages its natural language processing capabilities to extract the most important information and present it in a condensed form. Here's a Python code example illustrating how to use ChatGPT for text summarization:import openai openai.api_key = "YOUR_API_KEY" def generate_summary(text, max_tokens=50): response = openai.ChatCompletion.create( model="gpt-3.5-turbo", messages=[ {"role": "user", "content": f"Summarize the following text: '{text}'", "role": "assistant", "content": f"Please summarize the following text to around {max_tokens} tokens:"} ] ) summary = response['choices'][0]['message']['content'] return summary document_text = SAMPLE_TEXT summary_result = generate_summary(document_text) print(f"Summary: {summary_result}")Applications in Content Curation and Information Extraction:1. Content Curation: Content creators, marketers, and news aggregators can use ChatGPT to automatically summarize news articles, blog posts, or research papers. This streamlines the process of identifying relevant and interesting content to share with their audience.2. Research and Study: Researchers and students can employ ChatGPT to condense lengthy academic papers or reports into more manageable summaries. This helps in quickly grasping the key findings and ideas within complex documents.3. Business Intelligence: In the corporate world, ChatGPT can be employed to summarize market reports, competitor analyses, and industry trends. This enables executives and decision-makers to stay informed and make strategic choices more efficiently.By integrating ChatGPT's text summarization capabilities into various applications, users can enhance their ability to sift through and distill vast amounts of textual information, ultimately saving time and improving decision-making processes.4. Question Answering with ChatGPTChatGPT excels at answering questions, making it a versatile tool for building chatbots, virtual assistants, and FAQ systems. It can provide informative and context-aware responses to a wide range of queries. Here's a Python code example illustrating how to use ChatGPT for question answering:import openai openai.api_key = "YOUR_API_KEY" def ask_question(question, context): response = openai.ChatCompletion.create( model="gpt-3.5-turbo", messages=[ {"role": "user", "content": f"Context: {context}"}, {"role": "user", "content": f"Question: {question}"} ] ) answer = response['choices'][0]['message']['content'] return answer context = "The Eiffel Tower is a famous landmark in Paris, France. It was completed in 1889 and stands at 324 meters tall." question = "When was the Eiffel Tower built?" answer_result = ask_question(question, context) print(f"Answer: {answer_result}")Use in Chatbots, FAQs, and Virtual Assistants:1. Chatbots: ChatGPT can serve as the core intelligence behind chatbots, responding to user inquiries and engaging in natural conversations. Businesses can use chatbots for customer support, lead generation, and interactive marketing, delivering real-time assistance to users.2. FAQ Systems: Implementing ChatGPT in FAQ systems allows users to ask questions in a more natural and conversational manner. It ensures that users receive accurate and context-aware responses from a repository of frequently asked questions.3. Virtual Assistants: Virtual assistants powered by ChatGPT can assist users in various tasks, such as scheduling appointments, providing information, and even helping with language translation or summarization. They can be integrated into websites, applications, or devices to enhance user experiences.By harnessing ChatGPT's question-answering capabilities, organizations can create intelligent and responsive digital agents that deliver efficient and accurate information to users, improving customer satisfaction and user engagement across a wide range of applications.Ethical ConsiderationsAI and NLP technologies, like ChatGPT, raise ethical concerns, primarily concerning bias and misuse. Biases in training data can lead to unfair or discriminatory responses, while misuse can involve generating harmful content or misinformation. To responsibly use ChatGPT, consider:1. Bias Mitigation: Carefully curate and review training data to minimize biases. Implement debiasing techniques and provide guidelines for human reviewers to ensure fairness.2. Transparency: Be transparent about the AI's capabilities and limitations. Avoid giving it false identities or promoting misleading information.3. Content Moderation: Implement strong content moderation to prevent misuse. Regularly monitor and fine-tune the AI's responses to ensure they align with ethical standards.4. User Education: Educate users on the nature of AI-generated content, promoting critical thinking and responsible consumption.By proactively addressing these ethical concerns and adhering to guidelines, we can harness AI and NLP technologies like ChatGPT for positive, inclusive, and responsible outcomes.ConclusionIn conclusion, ChatGPT is a remarkable AI tool that showcases the transformative potential of Natural Language Processing (NLP). Key takeaways include its capabilities in sentiment analysis, language translation, text summarization, question answering, and chatbot development. However, ethical considerations like bias and misuse are critical and must be addressed responsibly. I encourage readers to harness ChatGPT and NLP in their projects, emphasizing transparency, bias mitigation, and responsible usage. By doing so, we can unlock the vast possibilities of these technologies while fostering fairness, accuracy, and positive impact across various domains. Explore, innovate, and shape a future where language and AI empower us all.Author BioBhavishya Pandit is a Data Scientist at Rakuten! He has been extensively exploring GPT to find use cases and build products that solve real-world problems.

0
0
9744

Packt

29 Aug 2013

20 min read

Design Tools and Basics

Packt

29 Aug 2013

20 min read

(For more resources related to this topic, see here.) Owning a Makerbot 3D printer means being able to make anything you want at a push of a button, right? 3D printer owners quickly find that while 3D printers have no end of things they can produce, they also are not without their limitations. Designing an object without 3D printing in mind will result in a failed print that more resembles a bird nest or a bowl of spaghetti. Making a 3D printable object requires learning a few rules, some careful planning, and design. But once you know the rules the results can be astounding. 3D printers can even produce things with ease that traditional manufacturing cannot, for example, objects with complex internal geometry that machining cannot touch. There are many places online such as Makerbot's own Thingiverse that hosts a daily growing library of printable objects. Printing out other people's designs is all well and good for a while, but the most exciting part about 3D printing is that it can produce your designs and models. Eventually, learning how to model for 3D printing is a must. Can you learn 3D modeling? If you've ever won a round of Pictionary you've got all the artistic skill it takes to get started. If you've ever gotten past level 1 on Tetris then you've got spatial reasoning. If you've ever played with modeling clay then you know all about designing in three dimensions. Design basics There are some design rules and basic ideas that will be true regardless of the modeling software used. The working of 3D printing 3D printing has come a long way in terms of technology and cost allowing home 3D printers to be a reality. In this process there have been choices that will limit what can be printed. Seeing a 3D printer in action is the best way to learn about the process. Fortunately there are many 3D printing time lapse videos online of printers in action that can be found with a simple search. 3D printers build an object layer-by-layer from the bottom to the top. Plastic filament is heated and extruded, and each layer is built upon the last one. Usually the outside of the object is drawn and sometimes additional shells are added for strength. Then the inside is usually filled with a lattice to save plastic and provide some support for higher layers, however the inside is mostly air. This continues until the object is complete as shown in the following screenshot: Because of this layer-by-layer process, if a design is made so that any part has nothing underneath it, dangling in the air, then the printer will still extrude some plastic to try to print the part which will just dangle from the nozzle and be dragged into the next area where it will build up an ugly mess and ruin the print: Building for supportless prints One way of fixing the dangling object problem is to configure the preparing software to build the model "with supports". This means the slicer will automatically build a support lattice of plastic, up to the dangling part so that it has something to print on. Higher-end printers can actually print with a different material that can be dissolved away, but so far most home printers only use break-away supports. Either way after the print is complete it is left to the user to clean up this support material to extract the desired part. While supports do allow the creation of objects that would be impossible any other way, the supports themselves are a waste of material and often don't remove cleanly leading to a messy bottom surface where they contact the print. If a part is designed needing supports that are hard to remove, such as if they're internal and partially obscured, it can be difficult and frustrating to completely remove the support material (this can be true for even the higher-end 3D printers). The process of removing it may actually damage the print. It is possible and very easy with just the slightest application of cleverness to make designs that are printable without the need for any supports. So the blueprints in this article focus on making designs that print without supports. The limitations imposed by this demands just a little more effort but allow for the teaching of principles that are generally good to know. Designing for dual extruders Some models of Makerbot and other 3D printers have the ability to print in multiple colors at once using two different extruder heads feeding plastic from two different spools. There are some fun prints that come from this process. But as most Makerbots and other brands of home 3D printers do not have dual extruders at this time this article will not explore this process in detail. The basic idea of the process is creating two files that are aligned to print in the same space and combining them in the slicer. Designing supportless – overhangs and bridges When designing for supportless printing the rules are simple: Y prints, H prints okay, T does not print well. Branching out with overhangs It is possible to have the current layer slightly larger than the previous layer provided the overhang is not more than 45 degrees. This is because the current layer will have enough of the previous layer to stick to. Hence a shape like the capital letter Y will successfully print standing up. However, if the overhang is too great or too abrupt the new layer will droop causing a print fail, hence a shape like the capital letter T does not print. (If the T is serif and thus has downward dangling bits, it will fail even worse, as illustrated previously.) So it is important to try to keep overhangs within a 45 degree cone as they go upwards. Building bridges If a part of the print has nothing above it, but has something on either side that it can attach to, then it may be able to bridge the gap. But use caution. The printer makes no special effort in making bridges; they are drawn like any other layer: outline first, then infill. As long as the outline has something to attach to on both sides it should be fine. But if that outline is too complex or contains parts that will print in mid-air, it may not succeed. Being aware of bridges in the design and keeping them simple is the key to successful bridging. Even with a simple bridge some 3D printers need a little bit more calibration to print it well. Hence a shape like the capital letter H will successfully print most of the time. Of course this discussion is purely illustrative of the way overhangs work or fail. In real life if a Y, H, or T needed to be printed the best way to do it would be to lay them down. But for purposes of illustration it still stands that Y prints, H prints okay, T does not. Choosing a modeling tool There are many choices of modeling programs that can be used to produce 3D printable objects. There are many factors including versatility, simplicity, and cost to take into account. A tool with too steep a learning curve can turn off new users to the idea all together. A tool with too limited a set of tools can frustrate a user when they hit the limit. Investing a lot of money into something that doesn't end up going anywhere can be extremely disappointing. So it is important to explore the options. SolidWorks (www.solidworks.com) and other drafting oriented programs can do technical shapes with extreme precision. They include the necessary tools to accurately describe a shape that can be brought into the real work with high fidelity. However these sorts of tools tend to be costly and don't do artistic or organic shapes very well. Their highly technical nature also gives them a steep learning curve. OpenSCAD (www.openscad.org) is free and famous among the people who make 3D printers and can make technically accurate models as well. OpenSCAD also allows the models to be parametric, meaning that by changing a few variables and recalculating a new shape is generated. But OpenSCAD is difficult to use unless the user has a very technical and programmatic mind since the shapes are literally built from lines of code. Zbrush (pixologic.com/zbrush), Sculptris (pixologic.com/sculptris), or Wings3D (www.wings3d.com) are great tools for modeling organic shapes like the kind used in video games or animation. Sculptris and Wings3D are free and are very easy to pick up and use. But these tools lack when precision is necessary. Sketchup (www.sketchup.com) is a great free program with a library of shapes built in ready to import and play with. Its modeling tools are great for precise or architectural models. Sketchup doesn't do organic shapes well either and it can be tricky loading the plug-ins necessary for Sketchup to export their models to something printable. Even then models from Sketchup often have to go through an extensive clean up phase before they'll be ready to print. Autodesk 123D (www.123dapp.com) is not one but a whole suite of free programs designed around 3D modeling with specific focus on 3D printing. There are programs to design creatures or precise shapes. There is even an app for converting pictures of real life objects into 3D models. Some are programs that run in browser, some are downloads and some are apps for Apple devices. It's an eclectic and powerful group of programs. The Autodesk 123D suite's weakness is in its general immaturity. Autodesk is making great efforts to make modeling for 3D printing accessible for everyone but its tools still need to mature somewhat before they'll be ready to explore in depth. Blender (blender.org) is a 3D animation program that features a robust set of modeling tools. Good for artistic and organic shapes, it can also be used when precision is needed. On top all that it is free and open source, so it is still in constant development. If Blender doesn't have a particular feature it is only a matter of time until it will be added. If fact by the time this article is published chances are the version of Blender used to make it will already be out of date, but most of Blender's functionality remains unchanged version-to-version. Blender is also completely customizable so that every feature, from the key strokes used to the overall look, can be changed. The downside of Blender is that its user interface is somewhat unintuitive. This causes Blender to have a famously difficult learning curve. Because of Blender's versatility and availability it is the tool of choice for the beginning 3D designer and this work. Installing Blender This will be the first project in the article. Fortunately downloading and installing Blender is as easy as 1-2-3-4. Go to www.blender.org. Click on the Download link. Choose and download the installer for your system: Windows, Mac, or Linux, 32 bit or 64 bit. Run the installer. The installer will guide the process of loading Blender and adding icons to the system. Windows Blender.org offers installer executable and ZIP files. The zip files are for advanced users who want a portable version of Blender. When in doubt choose the executable since it will set up icons making for easy access. If in doubt whether to use the 32 or 64-bit versions picking the 32-bit will insure compatibility, but it is a good idea to find out what type of system it is being installed on as 64-bit offers significant performance improvements. Windows 7 or greater will confirm that the installer should be run. Click on Yes to assure Windows that it's okay to install Blender. Then the installer will run. The install wizard's defaults are fine for most users. Simply put the mouse over the Next button and click on every button that appears under it. On the second screen read over the Blender Terms of Service and click on I Agree to proceed. Unless you manage your installed programs directories yourself it is best to leave the defaults on the third screen as it is. Then click on Next and the install process will start. When the install process finishes leave the check box check and click on Finish to exit the installer and run Blender. Getting acquainted with Blender When Blender starts up, the Splash Screen can be dismissed by left-clicking outside the screen. The screenshots in this article use a custom color palette for print and a smaller window to minimize wasted screen space. Customizing the color palette will be discussed briefly later but these cosmetic changes will not affect the instructions presented at all. The Blender interface is broken into several different customizable panels to keep things organized. Each panel has resizing widgets in the upper-right and lower-left corners. By clicking on these widgets the panels can be split to add more panels or expanded into the territory of another to collapse panels at the user's preference. However, for now the default panels will be discussed as they provide the most common functionality for beginners. The 3D View panel The main window where things will be happening is the 3D View panel. The largest portion of the 3D view consists of the viewport where most of the work will take place. On the left-hand side of the 3D View panel is a tool bar that consists of tools relevant to the selection and mode. If the tool bar is ever not visible it can be revealed (or hidden again) by pressing Tor by clicking on the plus icon on the right-hand side that will appear when the toolbar is hidden. The specific tools in this bar will be explored as they are needed in the projects. There is another plus icon on the right that will bring up the viewport properties with properties relevant to the current selection or the viewport. This can also be revealed or hidden by pressing the Nkey. Again, the specifics will be explored further as needed. At the bottom of the 3D View panel is the 3D view menu bar with additional options followed by menus and icons related to editing and views. Hovering the mouse over each button will show what they are for. The Outliner panel The Outliner panel contains a hieratical view of all the objects in the scene. Each object can be selected by clicking on their name or the object can be hidden, locked, or excluded from rendering. Rendering means making a high quality picture from a scene for things such as animation or presentations. Doing a proper render includes setting up scene lights, cameras, textures, material properties, and many other functions that will not be explored in this article as it does not do anything that helps produce models for printing. However, exploring this functionality outside of this article can be good when trying to show off the models if printing them is not an option. The Properties panel The Properties panel is broken up into many tabs indicated by small icons. Hovering over the icons will show the name of the tab. For modeling the two tabs that will be used the most are the object and modifier tabs. Specific exploration of the tools contained therein will be done as needed. The Info panel On the top of the Blender windows is the Info panel. On the left-hand side of the Info panel there is an easy to navigate menu similar to the menu in most applications. This menu can be collapsed by clicking on the + button next to it and expanded by clicking the same. On the far right of the Info panel there is data about the current scene. If the data cannot be seen, hover the mouse over the panel, and use the middle scroll wheel or click-and-hold the middle mouse button (pressing the wheel like a button) and moving the mouse to pan the panel until the desired data is visible. The Timeline panel The Timeline panel is only relevant to doing animation and can be effectively ignored or collapsed for the purposes of this article. Because this article is only using a limited subset of Blender's functionality some things such as the Timeline panel could be customized away. However, since it is not the focus of this article to tell the reader how to customize their version of Blender, and because Blender has a much broader application, the screenshots that follow will have the Timeline panel visible. The reader is encouraged to explore Blender's other functionalities such as rendering and animation at their leisure and desire. Proper stance While all of Blender's functions are available from buttons and menus on the screen, typical Blender users rely heavily on hotkeys and shortcuts. Already the T and N keys have been discussed to bring up or collapse the Tools and Properties tabs in the 3D view. For this reason it is recommended that to use Blender have one hand on the mouse and the other hand on the keyboard at all times. This tends to be a common stance for many people but is mentioned for the few for whom learning this will be of great help. Blender customization One of Blender's strengths is its customizability. Almost every feature from the look and color, right down to the keystrokes and hotkeys are used for every action in Blender. Customization is accomplished in the Properties menu accessed from the File | User Preferences menu. The buttons across the top, switch to the various categories. Each category is packed with options. A full exploration of these options is beyond the scope of this article but the reader is encouraged to explore these options and make Blender their own. For instance if the reader is using a setup where a middle mouse button is unavailable, Blender contains an option to emulate the middle mouse button by pressing Alt along with left-click. Other systems may require other accommodations many of which are available in this menu. Setting up for Mac OSX Mac OSX users require special consideration. Blender is made for a three button mouse. If a single button mouse is all that is available, click when the instructions say Left-Mouse Button, use Alt with click for Middle-Mouse Button, and press command with click for Right-Mouse Button. General Blender tips Blender employs some conventions that are unique to its environment and as such getting acquainted with its most common quirks early can avoid frustration. First and perhaps most importantly, Ctrl+ Z for undo works in Blender will undo a multitude of mistakes. Undo in Blender remembers many past steps allowing backing up to a point before a grievous error was made. Remembering this when following along with the blueprints that follow will save the reader much frustration. Next, the location of the mouse pointer is important when using hotkeys. For example the T and N keys for the Tools and Properties tabs do not bring up those tabs if the mouse pointer is not hovering over the 3D View. If the mouse is hovering over a different panel the reaction could be unpredictable. Pressing the A key with the mouse over the 3D View will toggle selection of all objects in the scene. Pressing the A key with the mouse over the Object Tools tab will collapse the expandable menu hiding all the options of that menu. Blender uses the right-click on the mouse to select objects by default. This is perhaps the most counter intuitive thing for first time users, particularly because it will be encountered so frequently. But not everything has been swapped, just object selection. This behavior can of course be customized. If the reader would like to customize selection to the left mouse button then it is left to them to adjust the instructions accordingly. Finally, the relation of Blender units to real life units is not by default defined in Blender. Generally it is just easier to remember that 1 grid point will translate to 1 millimeter in the printed object. As the scale is increased Blender inserts darker grid lines every 10 grid lines by default which correlate to centimeters. So the default cube in the default scene would measure 2 mm on each side, which is less than 1/10 of an inch, which is very small. Suggested shortcuts In the projects in this article, when a new idea is introduced it will first be introduced with detailed steps. Once a process is taught the next time the name of the operation and the shortcut for that process will be all that is given. This does not mean that the keyboard shortcuts are the only way to do an operation but they are often the preferred method for experienced Blender users. The reader is free to accomplish the operation in any way that is comfortable for them. The blueprints This article has been designed to teach 3D printing design in a hands-on approach. A series of projects or blueprints will be presented and each one will introduce new tools and techniques. Each one builds on the last. Despite being a "virtual" process, 3D modeling has a surprisingly muscle memory aspect to it. The movements and processes need to be more than a mechanical process being executed, they need to be practiced so they can be fluid and eventually seamless. To that end the reader is advised not to skip any of the blueprints and follow along actually doing each one. The objects being designed in this article are, most of them, very small so that they can be printed without taking too much time or producing too much waste. The reader can make larger versions if they like but that is left for their own challenge activities. Summary 3D printing is cool. Learning to design your own models is the best way to take full advantage of 3D printing today. This article will teach 3D modeling by a series of hands-on activities so it's a good idea not to skip and actually follow along with each blueprint. While home 3D printers have the capability to do break away supports these are messy and wasteful. It is possible to design things to be able to print without the need of any supports. When designing things for support-less 3D prints remember Y prints, H prints okay, T does not print. Keep outward inclines gradual and no more than 45 degrees to be safe. There are many 3D modeling programs to choose from. Some are expensive, some are free. Some are better for technical works, others do artistic or organic shapes better. Some are easy to learn, some take more practice. This article will use Blender since it is free and open source, has tools for modeling technical and organic shapes and is not too difficult to learn if you learn by doing. Blender can be a bit tricky to get started with since it employs some conventions unique to its environment. Blender can be customized but this article will stick with the defaults so everyone is on the same page. Generally remember that Ctrl+ Z undoes a multiple mistakes and can get Blender back to the state it was before, useful in tutorials to get back on track. The location of the mouse pointer is important when using Blender's hotkeys, which is the best way to learn to use Blender. Blender uses the right-click on mouse for selection by default. Finally, Blender's units translate to real life by 1 Blender grid space = 1 millimeter. Resources for Article: Further resources on this subject: Building Your First Bean in ColdFusion [Article] Designer Friendly Templates [Article] The Trivadis Integration Architecture Blueprint [Article]

0
0
9734

How-To Tutorials

Packt

16 Oct 2009

6 min read

Oracle VM Management

Packt

16 Oct 2009

6 min read

0
0
9730

Packt

17 Jul 2014

11 min read

HDFS and MapReduce

Packt

17 Jul 2014

11 min read

0
0
9726

Packt

21 Oct 2012

14 min read

Low-level C# Practices

Packt

21 Oct 2012

14 min read

Working with generics Visual Studio 2005 included .NET version 2.0 which included generics. Generics give developers the ability to design classes and methods that defer the specification of specific parts of a class or method's specification until declaration or instantiation. Generics offer features previously unavailable in .NET. One benefit to generics, that is potentially the most common, is for the implementation of collections that provide a consistent interface to collections of different data types without needing to write specific code for each data type. Constraints can be used to restrict the types that are supported by a generic method or class, or can guarantee specific interfaces. Limits of generics Constraints within generics in C# are currently limited to a parameter-less constructor, interfaces, or base classes, or whether or not the type is a struct or a class (value or reference type). This really means that code within a generic method or type can either be constructed or can make use of methods and properties. Due to these restrictions types within generic types or methods cannot have operators. Writing sequence and iterator members Visual Studio 2005 and C# 2.0 introduced the yield keyword. The yield keyword is used within an iterator member as a means to effectively implement an IEnumerable interface without needing to implement the entire IEnumerable interface. Iterator members are members that return a type of IEnumerable or IEnumerable<T>, and return individual elements in the enumerable via yield return, or deterministically terminates the enumerable via yield break. These members can be anything that can return a value, such as methods, properties, or operators. An iterator that returns without calling yield break has an implied yield break, just as a void method has an implied return. Iterators operate on a sequence but process and return each element as it is requested. This means that iterators implement what is known as deferred execution. Deferred execution is when some or all of the code, although reached in terms of where the instruction pointer is in relation to the code, hasn't entirely been executed yet. Iterators are methods that can be executed more than once and result in a different execution path for each execution. Let's look at an example: public static IEnumerable<DateTime> Iterator() { Thread.Sleep(1000); yield return DateTime.Now; Thread.Sleep(1000); yield return DateTime.Now; Thread.Sleep(1000); yield return DateTime.Now; } The Iterator method returns IEnumerable which results in three DateTime values. The creation of those three DateTime values is actually invoked at different times. The Iterator method is actually compiled in such a way that a state machine is created under the covers to keep track of how many times the code is invoked and is implemented as a special IEnumerable<DateTime> object. The actual invocation of the code in the method is done through each call of the resulting IEnumerator. MoveNext method. The resulting IEnumerable is really implemented as a collection of delegates that are executed upon each invocation of the MoveNext method, where the state, in the simplest case, is really which of the delegates to invoke next. It's actually more complicated than that, especially when there are local variables and state that can change between invocations and is used across invocations. But the compiler takes care of all that. Effectively, iterators are broken up into individual bits of code between yield return statements that are executed independently, each using potentially local shared data. What are iterators good for other than a really cool interview question? Well, first of all, due to the deferred execution, we can technically create sequences that don't need to be stored in memory all at one time. This is often useful when we want to project one sequence into another. Couple that with a source sequence that is also implemented with deferred execution, we end up creating and processing IEnumerables (also known as collections) whose content is never all in memory at the same time. We can process large (or even infinite) collections without a huge strain on memory. For example, if we wanted to model the set of positive integer values (an infinite set) we could write an iterator method shown as follows: static IEnumerable<BigInteger> AllThePositiveIntegers() { var number = new BigInteger(0); while (true) yield return number++; } We can then chain this iterator with another iterator, say something that gets all of the positive squares: static IEnumerable<BigInteger> AllThePostivieIntegerSquares( IEnumerable<BigInteger> sourceIntegers) { foreach(var value in sourceIntegers) yield return value*value; } Which we could use as follows: foreach(var value in AllThePostivieIntegerSquares(AllThePositiveIntegers())) Console.WriteLine(value); We've now effectively modeled two infi nite collections of integers in memory. Of course, our AllThePostiveIntegerSquares method could just as easily be used with fi nite sequences of values, for example: foreach (var value in AllThePostivieIntegerSquares( Enumerable.Range(0, int.MaxValue) .Select(v => new BigInteger(v)))) Console.WriteLine(value); In this example we go through all of the positive Int32 values and square each one without ever holding a complete collection of the set of values in memory. As we see, this is a useful method for composing multiple steps that operate on, and result in, sequences of values. We could have easily done this without IEnumerable<T>, or created an IEnumerator class whose MoveNext method performed calculations instead of navigating an array. However, this would be tedious and is likely to be error-prone. In the case of not using IEnumerable<T>, we'd be unable to operate on the data as a collection with things such as foreach. Context: When modeling a sequence of values that is either known only at runtime, or each element can be reliably calculated at runtime. Practice: Consider using an iterator. Working with lambdas Visual Studio 2008 introduced C# 3.0 . In this version of C# lambda expressions were introduced. Lambda expressions are another form of anonymous functions. Lambdas were added to the language syntax primarily as an easier anonymous function syntax for LINQ queries. Although you can't really think of LINQ without lambda expressions, lambda expressions are a powerful aspect of the C# language in their own right. They are concise expressions that use implicitly-typed optional input parameters whose types are implied through the context of their use, rather than explicit de fi nition as with anonymous methods. Along with C# 3.0 in Visual Studio 2008, the .NET Framework 3.5 was introduced which included many new types to support LINQ expressions, such as Action<T> and Func<T>. These delegates are used primarily as definitions for different types of anonymous methods (including lambda expressions). The following is an example of passing a lambda expression to a method that takes a Func<T1, T2, TResult> delegate and the two arguments to pass along to the delegate: ExecuteFunc((f, s) => f + s, 1, 2); The same statement with anonymous methods: ExecuteFunc(delegate(int f, int s) { return f + s; }, 1, 2); It's clear that the lambda syntax has a tendency to be much more concise, replacing the delegate and braces with the "goes to" operator (=>). Prior to anonymous functions, member methods would need to be created to pass as delegates to methods. For example: ExecuteFunc(SomeMethod, 1, 2); This, presumably, would use a method named SomeMethod that looked similar to: private static int SomeMethod(int first, int second) { return first + second; } Lambda expressions are more powerful in the type inference abilities, as we've seen from our examples so far. We need to explicitly type the parameters within anonymous methods, which is only optional for parameters in lambda expressions. LINQ statements don't use lambda expressions exactly in their syntax. The lambda expressions are somewhat implicit. For example, if we wanted to create a new collection of integers from another collection of integers, with each value incremented by one, we could use the following LINQ statement: var x = from i in arr select i + 1; The i + 1 expression isn't really a lambda expression, but it gets processed as if it were first converted to method syntax using a lambda expression: var x = arr.Select(i => i + 1); The same with an anonymous method would be: var x = arr.Select(delegate(int i) { return i + 1; }); What we see in the LINQ statement is much closer to a lambda expression. Using lambda expressions for all anonymous functions means that you have more consistent looking code. Context: When using anonymous functions. Practice: Prefer lambda expressions over anonymous methods. Parameters to lambda expressions can be enclosed in parentheses. For example: var x = arr.Select((i) => i + 1); The parentheses are only mandatory when there is more than one parameter: var total = arr.Aggregate(0, (l, r) => l + r); Context: When writing lambdas with a single parameter. Practice: Prefer no parenthesis around the parameter declaration. Sometimes when using lambda expressions, the expression is being used as a delegate that takes an argument. The corresponding parameter in the lambda expression may not be used within the right-hand expression (or statements). In these cases, to reduce the clutter in the statement, it's common to use the underscore character (_) for the name of the parameter. For example: task.ContinueWith(_ => ProcessSecondHalfOfData()); The task.ContinueWith method takes an Action <Task> delegate. This means the previous lambda expression is actually given a task instance (the antecedent Task). In our example, we don't use that task and just perform some completely independent operation. In this case, we use (_) to not only signify that we know we don't use that parameter, but also to reduce the clutter and potential name collisions a little bit. Context: When writing lambda expression that take a single parameter but the parameter is not used. Practice: Use underscore (_) for the name of the parameter. There are two types of lambda expressions. So far, we've seen expression lambdas. Expression lambdas are a single expression on the right-hand side that evaluates to a value or void. There is another type of lambda expression called statement lambdas. These lambdas have one or more statements and are enclosed in braces. For example: task.ContinueWith(_ => { var value = 10; value += ProcessSecondHalfOfData(); ProcessSomeRandomValue(value); }); As we can see, statement lambdas can declare variables, as well as have multiple statements. Working with extension methods Along with lambda expressions and iterators, C# 3.0 brought us extension methods. These static methods (contained in a static class whose first argument is modified with the this modifier) were created for LINQ so IEnumerable types could be queried without needing to add copious amounts of methods to the IEnumerable interface. An extension method has the basic form of: public static class EnumerableExtensions { public static IEnumerable<int> IntegerSquares( this IEnumerable<int> source) { return source.Select(value => value * value); } } As stated earlier, extension methods must be within a static class, be a static method, and the first parameter must be modified with the this modifier. Extension methods extend the available instance methods of a type. In our previous example, we've effectively added an instance member to IEnumerable<int> named IntegerSquares so we get a sequence of integer values that have been squared. For example, if we created an array of integer values, we will have added a Cubes method to that array that returns a sequence of the values cubed. For example: var values = new int[] {1, 2, 3}; foreach (var v in values.Cubes()) { Console.WriteLine(v); } Having the ability to create new instance methods that operate on any public members of a specific type is a very powerful feature of the language. This, unfortunately, does not come without some caveats. Extension methods suffer inherently from a scoping problem. The only scoping that can occur with these methods is the namespaces that have been referenced for any given C# source file. For example, we could have two static classes that have two extension methods named Cubes. If those static classes are in the same namespace, we'd never be able to use those extensions methods as extension methods because the compiler would never be able to resolve which one to use. For example: public static class IntegerEnumerableExtensions { public static IEnumerable<int> Squares( this IEnumerable<int> source) { return source.Select(value => value * value); } public static IEnumerable<int> Cubes( this IEnumerable<int> source) { return source.Select(value => value * value * value); } } public static class EnumerableExtensions { public static IEnumerable<int> Cubes( this IEnumerable<int> source) { return source.Select(value => value * value * value); } } If we tried to use Cubes as an extension method, we'd get a compile error, for example: var values = new int[] {1, 2, 3}; foreach (var v in values.Cubes()) { Console.WriteLine(v); } This would result in error CS0121: The call is ambiguous between the following methods or properties. To resolve the problem, we'd need to move one (or both) of the classes to another namespace, for example: namespace Integers { public static class IntegerEnumerableExtensions { public static IEnumerable<int> Squares( this IEnumerable<int> source) { return source.Select(value => value*value); } public static IEnumerable<int> Cubes( this IEnumerable<int> source) { return source.Select(value => value*value*value); } } } namespace Numerical { public static class EnumerableExtensions { public static IEnumerable<int> Cubes( this IEnumerable<int> source) { return source.Select(value => value*value*value); } } } Then, we can scope to a particular namespace to choose which Cubes to use: Context: When considering extension methods, due to potential scoping problems. Practice: Use extension methods sparingly. Context: When designing extension methods. Practice: Keep all extension methods that operate on a specific type in their own class. Context: When designing classes to contain methods to extend a specific type, TypeName. Practice: Consider naming the static class TypeNameExtensions. Context: When designing classes to contain methods to extend a specific type, in order to scope the extension methods. Practice: Consider placing the class in its own namespace. Generally, there isn't much need to use extension methods on types that you own. You can simply add an instance method to contain the logic that you want to have. Where extension methods really shine is for effectively creating instance methods on interfaces. Typically, when code is necessary for shared implementations of interfaces, an abstract base class is created so each implementation of the interface can derive from it to implement these shared methods. This is a bit cumbersome in that it uses the one-and-only inheritance slot in C#, so an interface implementation would not be able to derive or extend any other classes. Additionally, there's no guarantee that a given interface implementation will derive from the abstract base and runs the risk of not being able to be used in the way it was designed. Extension methods get around this problem by being entirely independent from the implementation of an interface while still being able to extend it. One of the most notable examples of this might be the System.Linq.Enumerable class introduced in .NET 3.5. The static Enumerable class almost entirely consists of extension methods that extend IEnumerable. It is easy to develop the same sort of thing for our own interfaces. For example, say we have an ICoordinate interface to model a three-dimensional position in relation to the Earth's surface: namespace ConsoleApplication { using Numerical; internal class Program { private static void Main(string[] args) { var values = new int[] {1, 2, 3}; foreach (var v in values.Cubes()) { Console.WriteLine(v); } } } } We could create a static class to contain extension methods to provide shared functionality between any implementation of ICoordinate. For example: public interface ICoordinate { /// <summary>North/south degrees from equator.</summary> double Latitude { get; set; } /// <summary>East/west degrees from meridian.</summary> double Longitude { get; set; } /// <summary>Distance from sea level in meters.</summary> double Altitude { get; set; } } Context: When designing interfaces that require shared code. Practice: Consider providing extension methods instead of abstract base implementations.

0
0
9723

article-image-software-defined-data-center

Packt

14 Nov 2016

33 min read

The Software-defined Data Center

Packt

14 Nov 2016

33 min read

In this article by Valentin Hamburger, author of the book Building VMware Software-Defined Data Centers, we are introduced and briefed about the software-defined data center (SDDC) that has been introduced by VMware, to further describe the move to a cloud like IT experience. The term software-defined is the important bit of information. It basically means that every key function in the data center is performed and controlled by software, instead of hardware. This opens a whole new way of operating, maintaining but also innovating in a modern data center. (For more resources related to this topic, see here.) But how does a so called SDDC look like – and why is a whole industry pushing so hard towards its adoption? This question might also be a reason why you are reading this article, which is meant to provide a deeper understanding of it and give practical examples and hints how to build and run such a data center. Meanwhile it will also provide the knowledge of mapping business challenges with IT solutions. This is a practice which becomes more and more important these days. IT has come a long way from a pure back office, task oriented role in the early days, to a business relevant asset, which can help organizations to compete with their competition. There has been a major shift from a pure infrastructure provider role to a business enablement function. Today, most organizations business is just as good as their internal IT agility and ability to innovate. There are many examples in various markets where a whole business branch was built on IT innovations such as Netflix, Amazon Web Services, Uber, Airbnb – just to name a few. However, it is unfair to compare any startup with a traditional organization. A startup has one application to maintain and they have to build up a customer base. A traditional organization has a proven and wide customer base and many applications to maintain. So they need to adapt their internal IT to become a digital enterprise, with all the flexibility and agility of a startup, but also maintaining the trust and control over their legacy services. This article will cover the following points: Why is there a demand for SDDC in IT What is SDDC Understand the business challenges and map it to SDDC deliverables The relation of a SDDC and an internal private cloud Identify new data center opportunities and possibilities Become a center of innovation to empower your organizations business The demand for change Today organizations face different challenges in the market to stay relevant. The biggest move was clearly introduced by smartphones and tablets. It was not just a computer in a smaller device, they changed the way IT is delivered and consumed by end users. These devices proved that it can be simple to consume and install applications. Just search in an app store – choose what you like – use it as long as you like it. If you do not need it any longer, simply remove it. All with very simplistic commands and easy to use gestures. More and more people relying on IT services by using a smartphone as their terminal to almost everything. These devices created a demand for fast and easy application and service delivery. So in a way, smartphones have not only transformed the whole mobile market, they also transformed how modern applications and services are delivered from organizations to their customers. Although it would be quite unfair to compare a large enterprise data center with an app store or enterprise service delivery with any app installs on a mobile device, there are startups and industries which rely solely on the smartphone as their target for services, such as Uber or WhatsApp. On the other side, smartphone apps also introduce a whole new way of delivering IT services, since any company never knows how many people will use the app simultaneously. But in the backend they still have to use web servers and databases to continuously provide content and data for these apps. This also introduces a new value model for all other companies. People start to judge a company by the quality of their smartphone apps available. Also people started to migrate to companies which might offer a better smartphone integration as the previous one used. This is not bound to a single industry, but affects a broad spectrum of industries today such as the financial industry, car manufacturers, insurance groups, and even food retailers, just to name a few. A classic data center structure might not be ideal for quick and seamless service delivery. These architectures are created by projects to serve a particular use case for a couple of years. An example of this bigger application environments are web server farms, traditional SAP environments, or a data warehouse. Traditionally these were designed with an assumption about their growth and use. Special project teams have set them up across the data center pillars, as shown in the following figure. Typically, those project teams separate after such the application environment has been completed. All these pillars in the data center are required to work together, but every one of them also needs to mind their own business. Mostly those different divisions also have their own processes which than may integrate in a data center wide process. There was a good reason to structure a data center in this way, the simple fact that nobody can be an expert for every discipline. Companies started to create groups to operate certain areas in a data center, each building their own expertise for their own subject. This was evolving and became the most applied model for IT operations within organizations. Many, if not all, bigger organizations have adopted this approach and people build their careers on this definitions. It served IT well for decades and ensured that each party was adding its best knowledge to any given project. However, this setup has one flaw, it has not been designed for massive change and scale. The bigger these divisions get, the slower they can react to request from other groups in the data center. This introduces a bi-directional issue – since all groups may grow in a similar rate, the overall service delivery time might also increase exponentially. Unfortunately, this also introduces a cost factor when it comes to service deployments across these pillars. Each new service, an organization might introduce or develop, will require each area of IT to contribute. Traditionally, this is done by human hand overs from one department to the other. Each of these hand overs will delay the overall project time or service delivery time, which is also often referred to as time to market. It reflects the needed time interval from the request of a new service to its actual delivery. It is important to mention that this is a level of complexity every modern organization has to deal with, when it comes to application deployment today. The difference between organizations might be in the size of the separate units, but the principle is always the same. Most organizations try to bring their overall service delivery time down to be quicker and more agile. This is often related to business reasons as well as IT cost reasons. In some organizations the time to deliver a brand new service from request to final roll out may take 90 working days. This means – a requestor might wait 18 weeks or more than four and a half month from requesting a new business service to its actual delivery. Do not forget that this reflects the complete service delivery – over all groups until it is ready for production. Also, after these 90 days the requirement of the original request might have changed which would lead into repeating the entire process. Often a quicker time to market is driven by the lines of business (LOB) owners to respond to a competitor in the market, who might already deliver their services faster. This means that today's IT has changed from a pure internal service provider to a business enabler supporting its organization to fight the competition with advanced and innovative services. While this introduces a great chance to the IT department to enable and support their organizations business, it also introduces a threat at the same time. If the internal IT struggles to deliver what the business is asking for, it may lead to leverage shadow IT within the organization. The term shadow IT describes a situation where either the LOBs of an organization or its application developers have grown so disappointed with the internal IT delivery times, that they actually use an external provider for their requirements. This behavior is not agreed with the IT security and can lead to heavy business or legal troubles. This happens more often than one might expect, and it can be as simple as putting some internal files on a public cloud storage provider. These services grant quick results. It is as simple as Register – Download – Use. They are very quick in enrolling new users and sometimes provide a limited use for free. The developer or business owner might not even be aware that there is something non-compliant going on while using this services. So besides the business demand for a quicker service delivery and the security aspect, there an organizations IT department has now also the pressure of staying relevant. But SDDC can provide much more value to the IT than just staying relevant. The automated data center will be an enabler for innovation and trust and introduce a new era of IT delivery. It can not only provide faster service delivery to the business, it can also enable new services or offerings to help the whole organization being innovative for their customers or partners. Business challenges—the use case Today's business strategies often involve a digital delivery of services of any kind. This implies that the requirements a modern organization has towards their internal IT have changed drastically. Unfortunately, the business owners and the IT department tend to have communication issues in some organizations. Sometimes they even operate completely disconnected from each other, as if each of them where their own small company within the organization. Nevertheless, a lot of data center automation projects are driven by enhanced business requirements. In some of these cases, the IT department has not been made aware of what these business requirements look like, or even what the actual business challenges are. Sometimes IT just gets as little information as: We are doing cloud now. This is a dangerous simplification since, the use case is key when it comes to designing and identifying the right solution to solve the organizations challenges. It is important to get the requirements from both sides, the IT delivery side as well as the business requirements and expectations. Here is a simple example how a use case might be identified and mapped to technical implementation. The business view John works as a business owner in an insurance company. He recognizes that their biggest competitor in the market started to offer a mobile application to their clients. The app is simple and allows to do online contract management and tells the clients which products the have enrolled as well as rich information about contract timelines and possible consolidation options. He asks his manager to start a project to also deliver such an application to their customers. Since it is only a simple smartphone application, he expects that it's development might take a couple of weeks and than they can start a beta phase. To be competitive he estimates that they should have something useable for their customers within maximum of 5 months. Based on these facts, he got approval from his manager to request such a product from the internal IT. The IT view Tom is the data center manager of this insurance company. He got informed that the business wants to have a smartphone application to do all kinds of things for the new and existing customers. He is responsible to create a project and bring all necessary people on board to support this project and finally deliver the service to the business. The programming of the app will be done by an external consulting company. Tom discusses a couple of questions regarding this request with his team: How many users do we need to serve? How much time do we need to create this environment? What is the expected level of availability? How much compute power/disk space might be required? After a round of brainstorming and intense discussion, the team still is quite unsure how to answer these questions. For every question there is a couple of variables the team cannot predict. Will only a few of their thousands of users adopt to the app, what if they undersize the middleware environment? What if the user adoption rises within a couple of days, what if it lowers and the environment is over powered and therefor the cost is too high? Tom and his team identified that they need a dynamic solution to be able to serve the business request. He creates a mapping to match possible technical capabilities to the use case. After this mapping was completed, he is using it to discuss with his CIO if and how it can be implemented. Business challenge Question IT capability Easy to use app to win new customers/keep existing How many users do we need to server? Dynamic scale of an environment based on actual performance demand. How much time do we need to create this environment? To fulfill the expectations the environment needs to be flexible. Start small – scale big. What is the expected level of availability? Analytics and monitoring over all layers. Including possible self healing approach. How much compute power/disk space might be required? Create compute nodes based on actual performance requirements on demand. Introduce a capacity on demand model for required resources. Given this table, Tom revealed that with their current data center structure it is quite difficult to deliver what the business is asking for. Also, he got a couple of requirements from other departments, which are going in a similar direction. Based on these mappings, he identified that they need to change their way of deploying services and applications. They will need to use a fair amount of automation. Also, they have to span these functionalities across each data center department as a holistic approach, as shown in the following diagram: In this example, Tom actually identified a very strong use case for SDDC in his company. Based on the actual business requirements of a "simple" application, the whole IT delivery of this company needs to adopt. While this may sound like pure fiction, these are the challenges modern organizations need to face today. It is very important to identify the required capabilities for the entire data center and not just for a single department. You will also have to serve the legacy applications and bring them onto the new model. Therefore it is important to find a solution, which is serving the new business case as well as the legacy applications either way. In the first stage of any SDDC introduction in an organization, it is key to keep always an eye on the big picture. Tools to enable SDDC There is a basic and broadly accepted declaration of what a SDDC needs to offer. It can be considered as the second evolutionary step after server virtualization. It offers an abstraction layer from the infrastructure components such as compute, storage, and network by using automation and tools as such as a self service catalog In a way, it represents a virtualization of the whole data center with the purpose to simplify the request and deployment of complex services. Other capabilities of an SDDC are: Automated infrastructure/service consumption Policy based services and applications deployment Changes to services can be made easily and instantly All infrastructure layers are automated (storage, network, and compute) No human intervention is needed for infrastructure/service deployment High level of standardization is used Business logic is for chargeback or show back functionality All of the preceding points define a SDDC technically. But it is important to understand that a SDDC is considered to solve the business challenges of the organization running it. That means based on the actual business requirements, each SDDC will serve a different use case. Of course there is a main setup you can adopt and roll out – but it is important to understand your organizations business challenges in order to prevent any planning or design shortcomings. Also, to realize this functionality, SDDC needs a couple of software tools. These are designed to work together to deliver a seamless environment. The different parts can be seen like gears in a watch where each gear has an equally important role to make the clockwork function correctly. It is important to remember this when building your SDDC, since missing on one part can make another very complex or even impossible afterwards. This is a list of VMware tools building a SDDC: vRealize Business for Cloud vRealize Operations Manager vRealize Log Insight vRealize Automation vRealize Orchestrator vRealize Automation Converged Blueprint vRealize Code Stream VMware NSX VMware vSphere vRealize Business for Cloud is a charge back/show back tool. It can be used to track cost of services as well as the cost of a whole data center. Since the agility of a SDDC is much higher than for a traditional data center, it is important to track and show also the cost of adding new services. It is not only important from a financial perspective, it also serves as a control mechanism to ensure users are not deploying uncontrolled services and leaving them running even if they are not required anymore. vRealize Operations Manager is serving basically two functionalities. One is to help with the troubleshooting and analytics of the whole SDDC platform. It has an analytics engine, which applies machine learning to the behavior of its monitored components. The other important function is capacity management. It is capable of providing what-if analysis and informs about possible shortcomings of resources way before they occur. These functionalities also use the machine learning algorithms and get more accurate over time. This becomes very important in an dynamic environment where on-demand provisioning is granted. vRealize Log Insight is a unified log management. It offers rich functionality and can search and profile a large amount of log files in seconds. It is recommended to use it as a universal log endpoint for all components in your SDDC. This includes all OSes as well as applications and also your underlying hardware. In an event of error, it is much simpler to have a central log management which is easy searchable and delivers an outcome in seconds. vRealize Automation (vRA) is the base automation tool. It is providing the cloud portal to interact with your SDDC. The portal it provides offers the business logic such as service catalogs, service requests, approvals, and application life cycles. However, it relies strongly on vRealize Orchestrator for its technical automation part. vRA can also tap into external clouds to extend the internal data center. Extending a SDDC is mostly referred to as hybrid cloud. There are a couple of supported cloud offerings vRA can manage. vRealize Orchestrator (vRO) is providing the workflow engine and the technical automation part of the SDDC. It is literally the orchestrator of your new data center. vRO can be easily bound together with vRA to form a very powerful automation suite, where anything with an application programming interface (API) can be integrated. Also it is required to integrate third-party solutions into your deployment workflows, such as configuration management database (CMDB), IP address management (IPAM), or ticketing systems via IT service management (ITSM). vRealize Automation Converged Blueprint was formally known as vRealize Automation Application Services and is an add-on functionality to vRA, which takes care of application installations. It can be used with pre-existing scripts (like Windows PowerShell or Bash on Linux) – but also with variables received from vRA. This makes it very powerful when it comes to on demand application installations. This tool can also make use of vRO to provide even better capabilities for complex application installations. vRealize Code Stream is an addition to vRA and serves specific use cases in the DevOps area of the SDDC. It can be used with various development frameworks such as Jenkins. Also it can be used as a tool for developers to build and operate their own software test, QA and deployment environment. Not only can the developer build these separate stages, the migration from one stage into another can also be fully automated by scripts. This makes it a very powerful tool when it comes to stage and deploy modern and traditional applications within the SDDC. VMware NSX is the network virtualization component. Given the complexity some applications/services might introduce, NSX will provide a good and profound solution to help solving it. The challenges include: Dynamic network creation Microsegmentation Advanced security Network function virtualization VMware vSphere is mostly the base infrastructure and used as the hypervisor for server virtualization. You are probably familiar with vSphere and its functionalities. However, since the SDDC is introducing a change to you data center architecture, it is recommended to re-visit some of the vSphere functionalities and configurations. By using the full potential of vSphere it is possible to save effort when it comes to automation aspects as well as the service/application deployment part of the SDDC. This represents your toolbox required to build the platform for an automated data center. All of them will bring tremendous value and possibilities, but they also will introduce change. It is important that this change needs to be addressed and is a part of the overall SDDC design and installation effort. Embrace the change. The implementation journey While a big part of this article focuses on building and configuring the SDDC, it is important to mention that there are also non-technical aspects to consider. Creating a new way of operating and running your data center will always involve people. It is important to also briefly touch this part of the SDDC. Basically there are three major players when it comes to a fundamental change in any data center, as shown in the following image: Basically there are three major topics relevant for every successful SDDC deployment. Same as for the tools principle, these three disciplines need to work together in order to enable the change and make sure that all benefits can be fully leveraged. These three categories are: People Process Technology The process category Data center processes are as established and settled as IT itself. Beginning with the first operator tasks like changing tapes or starting procedures up to highly sophisticated processes to ensure that the service deployment and management is working as expected they have already come a long way. However, some of these processes might not be fit for purpose anymore, once automation is applied to a data center. To build a SDDC it is very important to revisit data center processes and adopt them to work with the new automation tasks. The tools will offer integration points into processes, but it is equally important to remove bottle necks for the processes as well. However, keep in mind that if you automate a bad process, the process will still be bad – but fully automated. So it is also necessary to re-visit those processes so that they can become slim and effective as well. Remember Tom, the data center manager. He has successfully identified that they need a SDDC to fulfill the business requirements and also did a use case to IT capabilities mapping. While this mapping is mainly talking about what the IT needs to deliver technically, it will also imply that the current IT processes need to adopt to this new delivery model. The process change example in Tom's organization If the compute department works on a service involving OS deployment, they need to fill out an Excel sheet with IP addresses and server names and send it to the networking department. The network admins will ensure that there is no double booking by reserving the IP address and approve the requested host name. After successfully proving the uniqueness of this data, name and IP gets added to the organizations DNS server. The manual part of this process is not longer feasible once the data center enters the automation era – imagine that every time somebody orders a service involving a VM/OS deploy, the network department gets an e-mail containing the Excel with the IP and host name combination. The whole process will have to stop until this step is manually finished. To overcome this, the process has to be changed to use an automated solution for IPAM. The new process has to track IP and host names programmatically to ensure there is no duplication within the entire data center. Also, after successfully checking the uniqueness of the data, it has to be added to the Domain Name System (DNS). While this is a simple example on one small process, normally there is a large number of processes involved which need to be re-viewed for a fully automated data center. This is a very important task and should not be underestimated since it can be a differentiator for success or failure of an SDDC. Think about all other processes in place which are used to control the deploy/enable/install mechanics in your data center. Here is a small example list of questions to ask regarding established processes: What is our current IPAM/DNS process? Do we need to consider a CMDB integration? What is our current ticketing process? (ITSM) What is our process to get resources from network, storage, and compute? What OS/VM deployment process is currently in place? What is our process to deploy an application (hand overs, steps, or departments involved)? What does our current approval process look like? Do we need a technical approval to deliver a service? Do we need a business approval to deliver a service? What integration process do we have for a service/application deployment? DNS, Active Directory (AD), Dynamic Host Configuration Protocol (DHCP), routing, Information Technology Infrastructure Library (ITIL), and so on Now for the approval question, normally these are an exception for the automation part, since approvals are meant to be manual in the first place (either technical or business). If all the other answers to this example questions involve human interaction as well, consider to change these processes to be fully automated by the SDDC. Since human intervention creates waiting times, it has to be avoided during service deployments in any automated data center. Think of it as the robotic construction bands todays car manufacturers are using. The processes they have implemented, developed over ages of experience, are all designed to stop the band only in case of an emergency. The same comes true for the SDDC – try to enable the automated deployment through your processes, stop the automation only in case of an emergency. Identifying processes is the simple part, changing them is the tricky part. However, keep in mind that this is an all new model of IT delivery, therefore there is no golden way of doing it. Once you have committed to change those processes, keep monitoring if they truly fulfill their requirement. This leads to another process principle in the SDDC: Continual Service Improvement (CSI). Re-visit what you have changed from time to time and make sure that those processes are still working as expected, if they don't, change them again. The people category Since every data center is run by people, it is important to also consider that a change of technology will also impact those people. There are some claims that a SDDC can be run with only half of the staff or save a couple of employees since all is automated. The truth is, a SDDC will transform IT roles in a data center. This means that some classic roles might vanish, while others will be added by this change. It is unrealistic to say that you can run an automated data center with half the staff than before. But it is realistic to say that your staff can concentrate on innovation and development instead of working a 100% to keep the lights on. And this is the change an automated data center introduces. It opens up the possibilities to evolve into a more architecture and design focused role for current administrators. The people example in Tom's organization Currently there are two admins in the compute department working for Tom. They are managing and maintaining the virtual environment, which is largely VMware vSphere. They are creating VMs manually, deploying an OS by a network install routine (which was a requirement for physical installs – so they kept the process) and than handing the ready VMs over to the next department to finish installing the service they are meant for. Recently they have experienced a lot of demand for VMs and each of them configures 10 to 12 VMs per day. Given this, they cannot concentrate on other aspects of their job, like improving OS deployments or the hand over process. At a first look it seems like the SDDC might replace these two employees since the tools will largely automate their work. But that is like saying a jackhammer will replace a construction worker. Actually their roles will shift to a more architectural aspect. They need to come up with a template for OS installations and an improvement how to further automate the deployment process. Also they might need to add new services/parts to the SDDC in order to fulfill the business needs continuously. So instead of creating all the VMs manually, they are now focused on designing a blueprint, able to be replicated as easy and efficient as possible. While their tasks might have changed, their workforce is still important to operate and run the SDDC. However, given that they focus on design and architectural tasks now, they also have the time to introduce innovative functions and additions to the data center. Keep in mind that an automated data center affects all departments in an IT organization. This means that also the tasks of the network and storage as well as application and database teams will change. In fact, in a SDDC it is quite impossible to still operate the departments disconnected from each other since a deployment will affect all of them. This also implies that all of these departments will have admins shifting to higher-level functions in order to make the automation possible. In the industry, this shift is also often referred to as Operational Transformation. This basically means that not only the tools have to be in place, you also have to change the way how the staff operates the data center. In most cases organizations decide to form a so-called center of excellence (CoE) to administer and operate the automated data center. This virtual group of admins in a data center is very similar to project groups in traditional data centers. The difference is that these people should be permanently assigned to the CoE for a SDDC. Typically you might have one champion from each department taking part in this virtual team. Each person acts as an expert and ambassador for their department. With this principle, it can be ensured that decisions and overlapping processes are well defined and ready to function across the departments. Also, as an ambassador, each participant should advertise the new functionalities within their department and enable their colleagues to fully support the new data center approach. It is important to have good expertise in terms of technology as well as good communication skills for each member of the CoE. The technology category This is the third aspect of the triangle to successfully implement a SDDC in your environment. Often this is the part where people spend most of their attention, sometimes by ignoring one of the other two parts. However, it is important to note that all three topics need to be equally considered. Think of it like a three legged chair, if one leg is missing it can never stand. The term technology does not necessarily only refer to new tools required to deploy services. It also refers to already established technology, which has to be integrated with the automation toolset (often referred to as third-party integration). This might be your AD, DHCP server, e-mail system, and so on. There might be technology which is not enabling or empowering the data center automation, so instead of only thinking about adding tools, there might also be tools to be removed or replaced. This is a normal IT lifecycle task and has been gone through many iterations already. Think of things like a fax machine or the telex – you might not use them anymore, they have been replaced by e-mail and messaging. The technology example in Tom's organization The team uses some tools to make their daily work easier when it comes to new service deployments. One of the tools is a little graphical user interface to quickly add content to AD. The admins use it to insert the host name, Organizational Unit as well as creating the computer account with it. This was meant to save admin time, since they don't have to open all the various menus in the AD configuration to accomplish these tasks. With the automated service delivery, this has to be done programmatically. Once a new OS is deployed it has to be added to the AD including all requirements by the deployment tool. Since AD offers an API this can be easily automated and integrated into the deployment automation. Instead of painfully integrating the graphical tool, this is now done directly by interfacing the organizations AD, ultimately replacing the old graphical tool. The automated deployment of a service across the entire data center requires a fair amount of communication. Not in a traditional way, but machine-to-machine communication leveraging programmable interfaces. Using such APIs is another important aspect of the applied data center technologies. Most of today's data center tools, from backup all the way up to web servers, do come with APIs. The better the API is documented, the easier the integration into the automation tool. In some cases you might need the vendors to support you with the integration of their tools. If you have identified a tool in the data center, which does not offer any API or even command-line interface (CLI) option at all, try to find a way around this software or even consider replacing it with a new tool. APIs are the equivalent of hand overs in the manual world. The better the communication works between tools, the faster and easier the deployment will be completed. To coordinate and control all this communication, you will need far more than scripts to run. This is a task for an orchestrator, which can run all necessary integration workflows from a central point. This orchestrator will act like a conductor for a big orchestra. It will form the backbone of your SDDC. Why are these three topics so important? The technology aspect closes the triangle and brings the people and the processes parts together. If the processes are not altered to fit the new deployment methods – automation will be painful and complex to implement. If the deployment stops at some point, since the processes require manual intervention, the people will have to fill in this gap. This means that they now have new roles, but also need to maintain some of their old tasks to keep the process running. By introducing such an unbalanced implementation of an automated data center, the workload for people can actually increase, while the service delivery times may not dramatically decrease. This may lead to an avoidance of the automated tasks, since the manual intervention might seen as faster by individual admins. So it is very important to accept all three aspects as the main part of the SDDC implementation journey. They all need to be addressed equally and thoughtfully to unveil the benefits and improvements an automated data center has to offer. However, keep in mind that this truly is a journey. A SDDC is not implemented in days but in month. Given this, also the implementation team in the data center has this time to adopt themselves and their process to this new way of delivering IT services. Also all necessary departments and their lead needs to be involved in this procedure. A SDDC implementation is always a team effort. Additional possibilities and opportunities All the previews mentioned topics serve the sole goal to install and use the SDDC within your data center. However, once you have the SDDC running the real fun begins since you can start to introduce additional functionalities impossible for any traditional data center. Lets just briefly touch on some of the possibilities from an IT view. The self-healing data center This is a concept where the automatic deployment of services is connected to a monitoring system. Once the monitoring system detects that a service or environment may be facing constraints, it can automatically trigger an additional deployment for this service to increase the throughput. While this is application dependent, for infrastructure services this can become quite handy. Think of ESXi host auto deployments if compute power is becoming a constraint, or data store deployments if disk space is running low. If this automation is acting to aggressive for your organization, it can be used with an approval function. Once the monitoring detects a shortcoming it will ask for approval to fix it with a deployment action. Instead of getting an e-mail from your monitoring system that there is a constraint identified, you get an e-mail with the constraint and the resolving action. All you need to do is to approve the action. The self-scaling data center A similar principle is to use a capacity management tool to predict the growth of your environment. If it approaches a trigger, the system can automatically generate an order letter, containing all needed components to satisfy the growing capacity demands. This can than be sent to finance or the purchasing management for approval and before you even get into any capacity constraints, the new gear might be available and ready to run. However, consider the regular turnaround time for ordering hardware, which might affect how far in the future you have to set the trigger for such functionality. Both of this opportunities are more than just nice to haves, they enable your data center to be truly flexible and proactive. Due to the fact that a SDDC is offering a high amount of agility, it will also need some self-monitoring to stay flexible and useable and to fulfill unpredicted demand. Summary In this article we discussed the main principles and declarations of an SDDC. It provided an overview of the opportunities and possibilities this new data center architecture provides. Also, it covered the changes which will be introduced by this new approach. Finally it discussed the implementation journey and its involvement with people, processes and technology. Resources for Article: Further resources on this subject: VM, It Is Not What You Think! [article] Introducing vSphere vMotion [article] Creating a VM using VirtualBox - Ubuntu Linux [article]

0
0
9714

Packt

18 Sep 2014

8 min read

Redis in Autosuggest

Packt

18 Sep 2014

8 min read

0
0
9713

article-image-interact-with-hbase-using-hbase-shell-tutorial

Amey Varangaonkar

25 Jun 2018

9 min read

How to interact with HBase using HBase shell [Tutorial]

Amey Varangaonkar

25 Jun 2018

9 min read

0
0
9711

article-image-chatgpt-for-ab-testing-in-marketing-campaigns

Valentina Alto

22 Sep 2023

5 min read

ChatGPT for A/B Testing in Marketing Campaigns

Valentina Alto

22 Sep 2023

5 min read

Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!This article is an excerpt from the book, Modern Generative AI with ChatGPT and OpenAI Models, by Valentina Alto. Master core data architecture design concepts and Azure Data & AI services to gain a cloud data and AI architect’s perspective to developing end-to-end solutions.IntroductionIn the ever-evolving landscape of digital marketing, staying competitive and meeting customer expectations is paramount. This article explores the revolutionary potential of ChatGPT in enhancing multiple aspects of marketing. From refining A/B testing strategies to elevating SEO optimization techniques and harnessing sentiment analysis for measuring customer satisfaction, ChatGPT emerges as a pivotal tool. A/B testing for marketing comparisonAnother interesting field where ChatGPT can assist marketers is A/B testing.A/B testing in marketing is a method of comparing two different versions of a marketing campaign, advertisement, or website to determine which one performs better. In A/B testing, two variations of the same campaign or element are created, with only one variable changed between the two versions. The goal is to see which version generates more clicks, conversions, or other desired outcomes.An example of A/B testing might be testing two versions of an email campaign, using different subject lines, or testing two versions of a website landing page, with different call-to-action buttons. By measuring the response rate of each version, marketers can determine which version performs better and make data-driven decisions about which version to use going forward.A/B testing allows marketers to optimize their campaigns and elements for maximum effectiveness, leading to better results and a higher return on investment.Since this method involves the process of generating many variations of the same content, the generative power of ChatGPT can definitely assist in that.Let’s consider the following example. I’m promoting a new product I developed: a new, light and thin climbing harness for speed climbers. I’ve already done some market research and I know my niche audience. I also know that one great channel of communication for that audience is publishing on an online climbing blog, of which most climbing gyms’ members are fellow readers.My goal is to create an outstanding blog post to share the launch of this new harness, and I want to test two different versions of it in two groups. The blog post I’m about to publish and that I want to be the object of my A/B testing is the following:Figure – An example of a blog post to launch climbing gearHere, ChatGPT can help us on two levels:The first level is that of rewording the article, using different keywords or different attention grabbing slogans. To do so, once this post is provided as context, we can ask ChatGPT to work on the article and slightly change some elements:Figure – New version of the blog post generated by ChatGPTAs per my request, ChatGPT was able to regenerate only those elements I asked for (title, subtitle, and closing sentence) so that I can monitor the effectiveness of those elements by monitoring the reaction of the two audience groups.The second level is working on the design of the web page, namely, changing the collocation of the image rather than the position of the buttons. For this purpose, I created a simple web page for the blog post published in the climbing blog (you can find the code in the book’s GitHub repository at https://github.com/PacktPublishing/The-Ultimate-Guideto-ChatGPT-and-OpenAI/tree/main/Chapter%207%20-%20ChatGPT%20 for%20Marketers/Code):Figure – Sample blog post published on the climbing blogWe can directly feed ChatGPT with the HTML code and ask it to change some layout elements, such as the position of the buttons or their wording. For example, rather than Buy Now, a reader might be more gripped by an I want one! button.So, lets feed ChatGPT with the HTML source code:Figure – ChatGPT changing HTML codeLet’s see what the output looks like:Figure – New version of the websiteAs you can see, ChatGPT only intervened at the button level, slightly changing their layout, position, color, and wording.Indeed, inspecting the source code of the two versions of the web pages, we can see how it differs in the button sections:Figure – Comparison between the source code of the two versions of the websiteConclusionChatGPT is a valuable tool for A/B testing in marketing. Its ability to quickly generate different versions of the same content can reduce the time to market of new campaigns. By utilizing ChatGPT for A/B testing, you can optimize your marketing strategies and ultimately drive better results for your business.Author BioValentina Alto graduated in 2021 in data science. Since 2020, she has been working at Microsoft as an Azure solution specialist, and since 2022, she has been focusing on data and AI workloads within the manufacturing and pharmaceutical industry. She has been working closely with system integrators on customer projects to deploy cloud architecture with a focus on modern data platforms, data mesh frameworks, IoT and real-time analytics, Azure Machine Learning, Azure Cognitive Services (including Azure OpenAI Service), and Power BI for dashboarding. Since commencing her academic journey, she has been writing tech articles on statistics, machine learning, deep learning, and AI in various publications and has authored a book on the fundamentals of machine learning with Python.

0
0
9700

Common WLAN Protection Mechanisms and their Flaws

Quick start – creating your first grid

Installing OpenVPN on Linux and Unix Systems: Part 1

Apache Solr and Big Data – integration with MongoDB

Analyzing Moby Dick through frequency distribution with NLTK

Mastering CentOS 7 Linux Server

ChatGPT for Natural Language Processing (NLP)

Design Tools and Basics

Oracle VM Management

HDFS and MapReduce

Trending Topics

Low-level C# Practices

The Software-defined Data Center

Redis in Autosuggest

How to interact with HBase using HBase shell [Tutorial]

ChatGPT for A/B Testing in Marketing Campaigns

Create a Free Account To Continue Reading

Sign in to activate your 7-day free access