Do you need to be a polyglot to be a great programmer?

Amit Kothari
19 Jan 2018
6 min read
Recently, I was talking to someone who has been working as a developer for over a year. They asked me which programming languages they should learn in order to improve their employability and grow as a developer. This made me think: do we really need to be polyglots to be good programmers?

A polyglot programmer is someone who can write code in multiple languages. Most of us already use multiple programming languages. Someone working on web apps uses HTML, CSS, and JavaScript. Similarly, backend services might be written in a specific language, but the developer might still use SQL for database queries or YAML for configuration files. As developers, we like to try and learn new programming languages and frameworks. We do this for many reasons: to solve specific problems, to find a better alternative, or simply to keep up to date with what's new and trending.

The benefits of being a polyglot programmer

There are obvious benefits to being a polyglot developer.

It increases your employability. Being proficient in multiple languages looks very good on your resume. It shows your experience as a developer and also indicates that you are flexible, able to work with different tools in different situations.

It provides you with more opportunities and greater variety. When you're looking for a new job, or maybe even in your current role, being able to write code in multiple languages opens up many more opportunities. When you're a polyglot you become much more in control of your career destiny!

Developer happiness. Many developers simply feel more productive when they are using a specific language. But to know what you enjoy, you need to be open minded and willing to explore lots of different languages. Polyglots get to try out different syntaxes and get to know different communities - and this exploration is surely one of the best things about being a developer.

Along with all these benefits, working with different languages gives us a chance to learn about different programming paradigms. We can learn different ways of solving a problem and different ways of thinking. We can then bring all this learning together to write better code.

The challenges

While there are many benefits to learning and knowing multiple programming languages, this constant learning comes with its own challenges.

Lack of proficiency. In his book "JavaScript: The Good Parts," Douglas Crockford talks about the good and bad parts of JavaScript. Similarly, other languages have aspects that should be approached with caution. If you frequently change programming languages without spending enough time to learn one properly, you might run into issues around things like performance and security.

Maintenance becomes a nightmare. Having too many languages in a tech stack will likely become a maintenance nightmare for both the development and the operations side. This will take you somewhere that is the opposite of agile and efficient.

Developer fatigue. Constantly learning and adapting to new languages and technology may result in developer fatigue. It's a fact of tech today that developers feel stressed and under pressure - this is bound to affect not only their productivity but their health as well.

From an organization's perspective, there are tradeoffs when adding a new language to the tech stack. There may be operational costs and costs to up-skill the team. On the upside, code quality and productivity may improve.
Companies that avoid investing in up-skilling their teams and upgrading their tech stack may end up with systems that are difficult to maintain. Even small changes may take weeks to deliver, and finding skilled developers can become challenging. On the other hand, constantly changing programming languages and technology may result in features not getting delivered for months, in some cases years. There are many cases where a project started in one programming language and, after years of development, the team decided to rewrite the whole system in a newer language or framework. While architectures like microservices solve some of these problems by allowing us to write different parts of a given system in different languages without needing to rewrite the whole system, it is important to understand the cost of introducing a new language. The benefits we get out of it should always outweigh the cost.

"Any fool can write code that a computer can understand. Good programmers write code that humans can understand." - Martin Fowler

How to become a better developer

Learning different programming languages is one way to grow as a developer, but there are other things we can do to improve.

Write clean code. As developers, we spend more time reading code than writing it. Writing code that is easy to read and understand is one of the key traits of a good developer.

Write easy-to-maintain code. A good programmer puts in extra effort to make sure that the code is easy to maintain. Use design principles and test-driven development to make sure that the code can be modified with ease and with confidence that the change will not affect existing functionality.

Understand the problem. A good developer will try to understand the problem and then pick the appropriate tool to solve it, instead of starting with a technology just because it's trending.

There are lots of obvious advantages to learning multiple programming languages. Not only does it look good on a resume, it also helps you to improve as a developer. However, it is just as important to understand the business problems you're trying to solve. Whether you're a polyglot or not, the most important thing any developer can do is focus on the problems instead of the tools.

I hope you enjoyed this post; please let us know what you think! Are you a polyglot? Do you think trying to become one is important today?

Amit Kothari is a full stack software developer based in Melbourne, Australia. He has 10+ years of experience in designing and implementing software, mainly in Java/JEE. His recent experience is in building web applications using JavaScript frameworks like React and AngularJS, and backend microservices / REST APIs in Java. He is passionate about lean software development and continuous delivery.

13 reasons why Exit Polls get it wrong sometimes

Sugandha Lahoti
13 Nov 2017
7 min read
An exit poll, as the name suggests, is a poll taken immediately after voters exit the polling booth. Private companies working for popular newspapers or media organizations conduct these exit polls and are popularly known as pollsters. Once the data is collected, data analysis and estimation are used to predict the winning party and the number of seats captured. Turnout models, built using techniques such as logistic regression or random forests, are used to predict turnout in the exit poll results.

Exit polls depend on sampling, so a margin of error does exist. This describes how close pollsters expect an election result to be relative to the true population value. Normally, a margin of error of plus or minus 3 percentage points is acceptable. However, in recent times there have been instances where the poll average was off by a larger percentage. Let us analyze some of the reasons why exit polls can get their predictions wrong.

1. Sampling inaccuracy/quality

Exit polls depend on the sample size, i.e. the number of respondents or the number of precincts chosen. Incorrect estimation of this may lead to error margins. The quality of the sample data also matters. This includes factors such as whether the selected precincts are representative of the state, whether the polled audience in each precinct represents the whole, etc.

2. The model did not consider multiple turnout scenarios

Voter turnout refers to the percentage of eligible voters who cast a vote during an election. Pollsters may misestimate the number of people who actually vote based on the total number of people eligible to vote. Also, they often base their turnout prediction on past trends. However, voter turnout depends on many factors. For example, some voters might not turn up due to indifference or a perception that their vote might not count - which is not true. In such cases, pollsters adjust the weighting to reflect high or low turnout conditions by keeping the total turnout count in mind. Observations taken during a low turnout are also considered and the weights adjusted accordingly. In short, pollsters try their best to stay faithful to the original data.

3. The model did not consider past patterns

Pollsters may make a mistake by not delving into the past. They can gauge current turnout rates by taking into account presidential turnout or previous midterm elections. Although one may assume that turnout percentages have been stable over the years, a check on past voter turnout is a must.

4. The model was not recalibrated for the year and time of the election, such as odd-year midterms

Timing is a crucial factor in getting the right traction for people to vote. At times, some social issues are much more hyped and talked about than the elections. For instance, news of the Ebola virus outbreak in Texas was more prominent than news about the candidates standing in the 2014 midterm elections. Another example would be an election day set on a Friday versus any other weekday.

5. Number of contestants

Everyone has a personal favorite. In cases where there are just two contestants, it is straightforward to arrive at a clear winner. For pollsters, it is easier to predict votes when the whole world's talking about it, and they know which candidate is most talked about.
As the number of candidates increases, carrying out an accurate survey becomes more challenging for pollsters. They have to reach out to more respondents to carry out the survey effectively.

6. Swing voters/undecided respondents

Another possible explanation for discrepancies between poll predictions and the outcome is a large proportion of undecided voters in the poll samples. Possible solutions include asking relative questions instead of absolute ones, and allotting undecided voters in proportion to party support levels when making estimates.

7. Number of down-ballot races

Sometimes a popular party leader helps attract votes to another, less popular candidate of the same party. This is the down-ballot effect. At times, down-ballot candidates may receive more votes than party leader candidates, even when third-party candidates are included. Also, down-ballot outcomes tend to be influenced by the turnout for the polls at the top of the ballot. So the number of down-ballot races needs to be taken into account.

8. The cost incurred to commission a quality poll

A huge capital investment is required to commission a quality poll. The cost of a poll depends on the sample size (the number of people interviewed), the length of the questionnaire (the longer the interview, the more expensive it becomes), and the time within which interviews must be conducted, among other contributing factors. Also, if a polling firm is hired, or if cell phones are included in the survey, this adds to the expense.

9. Over-relying on historical precedent

Historical precedent is an estimate of the type of people who have shown up previously in a similar type of election. This precedent should be taken into consideration for better estimation of election results. However, care should be taken not to over-rely on it.

10. Effect of statewide ballot measures

Poll estimates also depend on state and local governments. Certain issues are pushed by local ballot measures. However, some voters feel that power over specific issues should belong exclusively to state governments. This causes opposition to local ballot measures in some states. These issues should be taken into account during estimation for better result prediction.

11. Oversampling due to factors such as faulty survey design or respondents' willingness/unwillingness to participate

Exit polls may also sometimes oversample voters for many reasons. One example relates to people in the US with cultural ties to Latin America. Although more than one-fourth of Latino voters prefer speaking Spanish to English, exit polls are almost never offered in Spanish. This may oversample English-speaking Latinos.

12. Social desirability bias in respondents

People may not always tell the truth about who they voted for. In other words, when asked by pollsters they are likely to place themselves on the safer side, as exit polling is a sensitive topic. Voters may tell pollsters that they voted for a minority candidate when they actually voted against the minority candidate. Social desirability bias is not limited to issues of race or gender. People like to be liked, and like to be seen as doing what everyone else is doing or what the "right" thing to do is - they play it safe.
Brexit polling, for instance, showed strong signs of social desirability bias.

13. The spiral of silence theory

People may not reveal their true thoughts to news reporters, as they may believe the media has an inherent bias. Voters may not declare their stand publicly for fear of reprisal or isolation. They choose to remain silent. This may also hinder estimate calculation for pollsters.

The above is just a shortlist of a long list of reasons why exit poll results must be taken with a pinch of salt. However, even with all its shortcomings, the striking feature of an exit poll is that rather than predicting a future action, it records an action that has just happened. So you rely on present indicators rather than ambiguous historical data. Exit polls are also cost-effective in obtaining very large samples. If exit polls are conducted properly, keeping in mind the points described above, they can predict election results with greater reliability.
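As a side note on the turnout models mentioned at the start of this article, here is a minimal sketch of what a logistic-regression turnout model can look like. The feature columns and training rows below are entirely hypothetical illustrations, not data from any real pollster's pipeline.

# Minimal sketch of a logistic-regression turnout model.
# All feature names and numbers below are made up for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Columns: age, voted_in_last_election (0/1), self_reported_enthusiasm (1-5)
X_train = np.array([
    [22, 0, 2],
    [35, 1, 4],
    [51, 1, 5],
    [63, 1, 3],
    [29, 0, 1],
    [45, 0, 4],
])
y_train = np.array([0, 1, 1, 1, 0, 1])  # 1 = actually turned out to vote

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Predicted turnout probability for two new respondents
X_new = np.array([[40, 1, 2], [19, 0, 5]])
print(model.predict_proba(X_new)[:, 1])

In practice pollsters would feed far richer respondent and precinct features into such a model, but the structure is the same: estimate each respondent's probability of voting, then weight their reported choice by that probability.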

GoMobile: GoLang's Foray into the Mobile World

Erik Kappelman
15 Feb 2017
6 min read
There is no question that the trend today in mobile app design is to get every possible language on board for creating mobile applications. This is sort of the case with GoMobile. Far from being originally intended for creating mobile apps, Go, or GoLang, was originally created at Google in 2007. Go has true concurrency capabilities, which can lend themselves well to any programming task, certainly mobile app creation.

The first thing you need to do to follow along with this blog is get the GoLang binaries on your machine. Although there is a GCC tool to compile Go, I would strongly recommend using the Go tools. I like Go because it is powerful, safe, and it feels new. It may simply be a personal preference, but I think Go is a largely underrated language. This blog will assume a minimum understanding of Go; don't worry so much about the syntax, but you will need to understand how Go handles projects and packages.

So to begin, let's create a new folder and specify it as our $GOPATH bash variable. This tells Go where to look for code and where to place downloaded packages, such as GoMobile. After we specify our $GOPATH, we add the bin subdirectory of the $GOPATH to our global $PATH variable. This allows Go tools to be executed like any other bash command:

$ cd ~
$ mkdir GoMobile
$ export GOPATH=~/GoMobile
$ export PATH=$PATH:$GOPATH/bin

The next step is somewhat more convoluted. Today, we are getting started with Android development. I chose Android over iOS because GoMobile can build for Android on any platform, but can only build for iOS on OS X. In order for GoMobile to be able to work its magic, you'll need to install the Android NDK. I think the easiest way to do this is through Android Studio.

Once you have the Android NDK installed, it's time to get started. We are going to be using an example app from our friends over at Go today. The app structure required for Go-based mobile apps is fairly complex. With this in mind, I would suggest using this codebase as you begin developing your own apps. This might save you some time. So, let's first install GoMobile:

$ go get golang.org/x/mobile/cmd/gomobile

Now, let's get that example app:

$ go get -d golang.org/x/mobile/example/basic

For the next command, we are going to initialize GoMobile and specify the NDK location. The online help for this example is somewhat vague when it comes to specifying the NDK location, so hopefully my research will save you some time:

$ gomobile init -ndk=$HOME/Library/Android/sdk/ndk-bundle/

Obviously, this is the path on my machine, so yours may be different; however, if you're on anything Unix-like, it ought to be relatively close. At this point, you are ready to build the example app. All you have to do is use the command below, and you'll be left with a real live Android application:

$ gomobile build golang.org/x/mobile/example/basic

This will build an APK file and place it in your $GOPATH. This file can be transferred to and installed on an actual Android device, or you can use an emulator. To use the emulator, you'll need to install the APK file using the adb command. This command should already be on board with your installation of Android Studio.
The following command adds the adb command to your path (your path might be different, but you'll get the idea):

export PATH=$PATH:$HOME/Library/Android/sdk/platform-tools/

At this point, you ought to be able to run the adb install command and try out the app on your emulator:

adb install basic.apk

As you will see, there isn't much to this particular app, but in this case, it's about the journey and not the destination. There is another way to install the app on your emulator. First, uninstall the app from your Android VM. Second, run the following command:

gomobile install golang.org/x/mobile/example/basic

Although the result is the same, the second method is almost identical to the way regular Go builds applications. For consistency's sake, I would recommend using the second method.

If you're new to Go, at this point I would recommend checking out some of the documentation. There is an interactive tutorial called A Tour of Go. I have found this tutorial enormously helpful for beginner to intermediate needs. You will need to have a pretty deep understanding of Go to be an effective mobile app developer.

If you are new to mobile app design in general, e.g. you don't already know Java, I would recommend taking the Go route. Although Java is still the most widely used language the world over, I myself have a strong preference for Go. If you will be using Go in the other elements of your mobile app - say, a web server that controls access to data required for the app's operations - using Go and GoMobile can be even more helpful. This allows for code consistency across the various levels of a mobile app. This is similar to the benefit of using the MEAN stack for web development, in that one language controls all levels of the app. In fact, there are tools now that allow JavaScript to be used in the creation of mobile apps, and then, presumably, a developer could use Node.js for a backend server, ending up with a MEAN-like mobile stack. While this would probably work fine, Go is stronger and perhaps safer than JavaScript. Also, because mobile development is essentially software development, which is fundamentally different from web development, using a language geared toward software development makes more intuitive sense. However, these thoughts are largely opinions and, as I have said before in many blogs, there are so many options - just find the one you like that gets the job done.

About the Author
Erik Kappelman is a transportation modeler for the Montana Department of Transportation. He is also the CEO of Duplovici, a technology consulting and web design company.

4 surprising things from Stack Overflow's 2018 survey

Richard Gall
27 Mar 2018
3 min read
This year's Stack Overflow survey features a wealth of insights on developers around the world. There were some takeaways worth noting that open the door to wider investigation. Here are 4 Stack Overflow survey highlights we think merit further discussion.

25% of developers think a regulatory body should be responsible for AI ethics

The developers who believe a regulatory body should be responsible for AI ethics were in the minority - more believe developers themselves should be responsible for ethical decisions around the artificial intelligence they help to build. However, the fact that 1 in 4 of Stack Overflow's survey respondents believe we need a regulatory body to monitor ethics in AI is not to be ignored - even if for the most part developers believe they are best placed to make ethical decisions, that feeling is far from unanimous. This means there is some unease about ethics and artificial intelligence that is, at the very least, worth talking about in more detail.

The ethics of code remains a gray area

There were a number of interesting questions around writing code for ethical purposes in this year's survey. 58.5% of respondents said they wouldn't write unethical code if they were asked to, 4.8% said they would, and 35.6% said it depends on what it is. Clearly, the notion of ethical code remains something that needs to be properly addressed within the developer and tech community. The recent Facebook and Cambridge Analytica scandal has only served to emphasize this. Equally interesting were the responses to the question about responsibility for ethical code: 57.5% said upper management is ultimately responsible for code that accomplishes something unethical, but 22.8% said it was 'the person who came up with the idea' and 19.7% said 'the developer who wrote it'.

Hackathons and coding competitions are a crucial part of developer learning

26% of respondents have learned new skills at hackathons. When you compare that to the 35% of people who say they get on-the-job training, it's easy to see just how important a role hackathons play in the professional development of developers. A similar proportion (24.3%) said coding competitions were also an important part of their technical education. Put the two together, and there's clear evidence that software learning is happening in the community more than in the workplace. Arguably, today's organizations are growing and innovating on the back of developer curiosity and ingenuity.

Transgender and non-binary programmers contribute to open source at high rates

This will probably go largely unnoticed, but it's worth underlining. It was, in fact, one of the Stack Overflow survey's highlights: "developers who identify as transgender and non-binary contribute to open source at higher rates (58% and 60%, respectively) than developers who identify as men or women overall (45% and 33%)." This is a great statistic and one that's important to recognize amid the diversity problems within technology. It is, perhaps, a positive signal that things are changing.

Technical debt is damaging businesses

Richard Gall
11 Jun 2018
5 min read
A lot of things make working in tech difficult. Technical debt is one of them. Whether you're working in-house or for an external team, you've probably experienced some tough challenges when it comes to legacy software. Most people have encountered strange internal software systems, or a CMS that has been customized in a way that no one has the energy to fathom. Working your way around and through these can be a headache, to say the least.

In this year's Skill Up survey, we found that technical debt and legacy issues are seen by developers as the biggest barrier to business goals. According to 49% of respondents, old technology and software is stopping organizations from reaching their full potential. But it might also be stopping developers from moving forward in their careers. Read the report in full: sign up to our newsletter and download the PDF for free.

Technical debt and the rise of open source

Arguably, issues around technical debt have become more pronounced in the last decade as the pace of technical change has seemingly increased. I say seemingly, because it's not so much that we're living in an entirely new technical landscape. It's more that the horizons of that landscape are expanding. There are more possibilities and options open to businesses today. Technology leadership is difficult in 2018. To do it well, you need to stay on top of new technologies. But you also need a solid understanding of your internal systems and your team, as well as wider strategic initiatives and business goals. There are a lot of threads you need to manage.

Are technology leaders struggling with technical debt?

Perhaps technology leaders are struggling. But perhaps they're also making the best of difficult situations. When you're juggling multiple threads in the way I've described, you need to remain focused on what's important. Ultimately, that's delivering software that delivers value. True, your new mobile app might not be ideal; the internal CMS you were building for a client might not offer an exemplary user experience. But it still does the job - and that, surely, is the most important thing?

We can do better - let's solve technical debt together

It's important to be realistic. In the age of burnout and overwork, let's not beat ourselves up when things aren't quite what we want. Much of software engineering is, after all, making the best of a bad situation. But the solutions to technical debt can probably be found in a cultural shift. The lack of understanding of technology on the part of management is surely a large cause of technical debt. When projects aren't properly scoped and deadlines are set without a clear sense of what level of work is required, that's when legacy issues begin to become a problem. In fact, it's worth looking at the other barriers. In many ways, they are each a piece of the puzzle if we are to use technology more effectively - more imaginatively - to solve business problems. Take these three: lack of quality training or learning, team resources, and lack of investment in projects. All of these point to a wider cultural problem with the way software is viewed in businesses. There's no investment, teams are under-resourced, and support to learn and develop new skills is simply not being provided. With this lack of regard for software, it's unsurprising that developers are spending more time solving problems in, say, legacy code than solving big, interesting problems. Ones that might actually have a big impact.
One way of solving technical debt, then, is to make a concerted effort to change the cultural mindset. Yes, some of this will need to come from senior management, but all software engineers need to take responsibility. This means better communication and collaboration, and a commitment to documentation - those things that are so easy to forget to do well when you could be shipping code.

What happens if we don't start solving technical debt?

Technical debt is like global warming - it's happening already. We feel the effects every day. However, it's only going to get worse. Yes, it's going to damage businesses, but it's also going to hurt developers. It's restricting the scope of developers to do the work they want to do and make a significant impact on their businesses. It seems as though we're locked in a strange cycle where businesses talk about the importance of 'digital skills' and technical knowledge gaps but, ironically, can't offer the resources or scope for talented developers to actually do their job properly. Developers bring skills, ideas, and creativity to jobs only to find that there isn't really time to indulge that creativity. "Maybe next year, when we have more time" goes the common refrain. There's never going to be more time - that's obvious to anyone who's ever had a job, engineer or otherwise. So why not take steps to start solving technical debt now?

Read next
8 Reasons why architects love API driven architecture
Python, Tensorflow, Excel and more – Data professionals reveal their top tools
The best backend tools in web development

Is Blockchain a failing trend or can it build a better world? Harish Garg provides his insight [Interview]

Packt Editorial Staff
02 Jan 2019
4 min read
In 2018, Blockchain and cryptocurrency exploded across tech. We spoke to Packt author Harish Garg on what he sees as the future of Blockchain in 2019 and beyond.

Harish Garg, founder of BignumWorks Software LLP, is a data scientist and lead software developer with 17 years' software industry experience. BignumWorks is an India-based software consultancy that provides consultancy services in software development and technical training. Harish worked at McAfee/Intel for 11+ years. He is an expert in creating data visualizations using R, Python, and web-based visualization libraries. Find all of Harish Garg's books for Packt here.

From early adopters to the enterprise

What do you think was the biggest development in blockchain during 2018?

The biggest development in Blockchain during 2018 was the explosion of Blockchain-based digital currencies. We now have thousands of different coins and projects supported by these coins. 2018 was also the year when Blockchain really captured the imagination of the public at large, beyond just technically savvy early adopters. 2018 also saw first a dramatic rise in the price of digital currencies, especially Bitcoin, and then a similarly dramatic fall in the last half of the year.

Do you think 2019 is the year that enterprise embraces blockchain? Why?

Absolutely. Early adoption of enterprise blockchain was already underway in 2018. Companies like IBM have already released and matured their Blockchain offerings for enterprises. 2018 also saw the big behemoth of cloud services, Amazon Web Services, launch its own Blockchain solutions. We are on the cusp of wider adoption of Blockchain in enterprises in 2019.

Key Blockchain challenges in 2019

What do you think the principal challenges in deploying blockchain technology are, and how might developers address them in 2019?

Two schools of thought have emerged about the way blockchain is perceived. On one side, there are people who pitch Blockchain as some kind of ultimate utopia, the last solution to solve all of humanity's problems. And on the other end of the spectrum are people who dismiss Blockchain as another fading trend with nothing substantial to offer. These two schools pose the biggest challenge to the success of Blockchain technology. The truth lies somewhere in between. Developers need to take the job of Blockchain evangelism into their own hands and make sure the right kind of expectations are set for policy makers and customers.

Have the Bitcoin bubble and greater scrutiny from regulators made blockchain projects less feasible, or do they provide a more solid market footing for the technology? Why?

Bitcoin would have invited a lot of scrutiny from regulators and governments even without the bubble. Bitcoin upends the notion of a nation state controlling the supply of money. So, obviously, different governments are reacting to it with a wide range of actions, ranging from an outright ban on using the existing banking systems to buy and sell Bitcoin and other digital currencies, to some countries putting a legal framework in place to securely let their citizens trade in them. The biggest fear they have is black money being pumped into digital currencies. With proper KYC procedures, these fears can be addressed. However, governments and financial institutions are also realizing the advantages Blockchain offers in streamlining their banking and financial markets, and are launching pilot projects to adopt Blockchain.
Blockchain and disruption in 2019

Will Ethereum continue to dominate the industry or are there new platforms that you think present a serious challenge? Why?

Ethereum does have an early mover advantage. However, we know that an early mover advantage is not such a big moat for new competitors to cross. Competing and bigger platforms are likely to emerge from the likes of Facebook, Amazon, and IBM that will solve the scalability issues Ethereum faces.

What industries do you think blockchain technology is most likely to disrupt in 2019, and why?

Finance and banking are still the biggest industries that will see an explosion of creative products coming out of the adoption of Blockchain technology. Products for government use are going to be big, especially wherever there is a need for an immutable source of truth, as in the case of land records.

Do you have any other thoughts on the future of blockchain you'd like to share?

We are at a very early stage of Blockchain adoption. It's very hard to predict right now what kind of killer apps will emerge a few years down the line. Nobody predicted in 2007 that smartphones would give rise to apps like Uber. The important thing is to have the right mix of optimism and skepticism.

Skill Up 2017: What we learned about tech pros and developers

Packt
17 Jul 2017
2 min read
The results are in. 4,731 developers and tech professionals have spoken. And we think you'll find what they have to say pretty interesting. From the key tools and trends that are disrupting and changing the industry, to learning patterns and triggers, this year's report takes a bird's eye view of what's driving change and what's impacting the lives of developers and tech pros around the globe in 2017. Here are the key findings - but download the report to make sure you get the full picture of your peers' professional lives.

60% of our respondents have either a 'reasonable amount of choice' or a 'significant amount of choice' over the tools they use at work - which means that understanding the stack and the best ways to manage it is a key part of any technology professional's knowledge.

28% of respondents believe technical expertise is used either 'poorly' or 'very poorly' in their organization. Almost half of respondents believe their manager has less technical knowledge than they do.

People who work in tech are time poor - 64% of respondents say time is the biggest barrier to their professional development.

The Docker revolution is crossing disciplines, industries and boundaries - it's a tool being learned by professionals across industries.

Python is the go-to language for a huge number of different job roles - from management to penetration testing.

40% of respondents dedicate time to learning every day - a further 44% dedicate time once a week.

Young tech workers are keen to develop the skillset they need to build a career but can find it hard to find the right resources - they also say they lack motivation.

Big data roles are among the highest paying in the software landscape - demonstrating that organizations are willing to pay big bucks for people with the knowledge and experience.

Tools like Kubernetes and Ansible are increasing in popularity - highlighting that DevOps is becoming a methodology - or philosophy - that organizations are starting to adopt.

That's not everything - but it should give you a flavour of the topics that this year's report touches on. Download this year's Skill Up report here.

5 Mistakes Developers Make When Working With HBase

Tess Hsu
19 Oct 2016
3 min read
Having worked with HBase for over six years, I want to share some common mistakes developers make when using HBase.

1. Using a PrefixFilter without setting a start row

This has come up several times on the mailing list over the years. Here is the filter: Github. The use case is to find rows that have a given prefix. Some people complain that the scan is too slow when using PrefixFilter. This is usually because they did not specify a proper start row. Suppose there are 10K regions in the table, and the first row satisfying the prefix is in the 3000th region. Without a proper start row, the scan begins with the first region. In HBase 1.x, you can use the following method of Scan:

public Scan setRowPrefixFilter(byte[] rowPrefix) {

This sets a start row for you.

2. Incurring low free HDFS space due to HBase snapshots hanging around

In theory, you can have many HBase snapshots in your cluster. This does place a considerable burden on HDFS, and the large number of hfiles may slow down the Namenode. Suppose you have a five-column-family table with 40K regions. Each column family has 6 hfiles before compaction kicks in. For this table, you may have 1.2 million hfiles. Take a snapshot to reference the 1.2 million hfiles. After routine compactions, another snapshot is taken, so roughly a million more hfiles would be referenced. Prior hfiles stay until the snapshot that references them is deleted. This means that having a practical schedule for cleaning up unneeded snapshots is a recipe for satisfactory cluster performance.

3. Retrieving the last N rows without using a reverse scan

In some scenarios, you may need to retrieve the last N rows. Assuming salting of keys is not involved, you can use the following API of Scan:

public Scan setReversed(boolean reversed) {

On the client side, you can choose the proper data structure so that sorting is not needed; for example, use a LinkedList.

4. Running multiple region servers on the same host due to heap size considerations

Some users run several region servers on the same machine to keep as much data in the block cache as possible, while at the same time minimizing GC time. Compared to having one region server with a huge heap, GC tuning is a lot easier. Deployment has some pain points, though, because a lot of the start/stop scripts don't work out of the box. With the introduction of the bucket cache, GC activity comes down greatly. There is no need to use the above trick. See here.

5. Receiving a NoNode zookeeper exception due to a misconfigured parent znode

When the zookeeper.znode.parent config value on the client side doesn't match the one for your cluster, you may see the following exception:

Exception in thread "main" org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /hbase/master
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1184)
    at com.ngdata.sep.util.zookeeper.ZooKeeperImpl.getData(ZooKeeperImpl.java:238)

One possible scenario is that hbase-site.xml is not on the classpath of the client application, so the default value for zookeeper.znode.parent doesn't match the actual one for your cluster. When you get hbase-site.xml onto the classpath, the problem should go away.

About the author
Ted Yu is a staff engineer at Hortonworks. He has also been an HBase committer/PMC member for five years.
His work on HBase covers various components: security, backup/restore, load balancer, MOB, and so on. He has provided support for customers at eBay, Micron, PayPal, and JPMC. He is also a Spark contributor.

Redis Cluster Features Overview

Zhe Lin
15 Jan 2016
4 min read
After months of development and testing, Redis 3.0 Cluster was released on April 1st, 2015. A Redis Cluster is a set of Redis instances connected to each other with the gossip protocol, with each instance serving a nonoverlapping subset of all the cached data. In this post, I'd like to talk about how users can benefit from it, and also what the cost of those benefits is.

The essence of Redis, as you may already know, is that no matter what kinds of structures Redis supports, it is simply a key-value caching utility. Things are the same with Redis Cluster. A Redis Cluster is not something that magically shards your data across different Redis instances. The keys are still the unit and are not splittable. For example, if you have a list of 100 elements, they will still be stored in one key, in one Redis instance, no matter how many instances are in the cluster. More precisely, Redis Cluster uses the CRC16 of a key string mod 16384 as the slot number of the key, and each master Redis instance serves some of the 16384 slots, so that each instance just takes responsibility for keys in the slots it owns.

Knowing this, you may soon realize that Redis Cluster finally catches up with the multi-core fashion. As we know, Redis is designed as an asynchronous single-threaded program, which means that although it is non-blocking, it can use at most one CPU. Since Redis Cluster simply splits keys across different instances by hash, and those instances can serve data simultaneously, as many CPUs as there are instances in the cluster can be used, so Redis QPS can become much higher than with a standalone Redis.

Another piece of good news is that Redis instances on different hosts can be joined into one cluster, which means the memory a Redis service can use is no longer limited to one host machine, and you won't have to worry about how much memory Redis may consume three months later, because if memory is about to run out, we can extend Redis capacity by starting some more cluster-mode instances, joining them to the cluster, and doing a reshard.

There is also great news for those who turn on persistence options (RDB or AOF). When Redis persists data, it forks before writing, which can cause latency if your dataset is really large. But there is no large dataset in a cluster, since it's all sharded, and each instance just persists its own subset.

The next advantage you should know about is the availability improvement. A Redis Cluster will be much more robust than a standalone Redis if you deploy a slave for each master. The slaves in cluster mode are different from those in standalone mode, as they can automatically fail over their master if it is disconnected (accidentally killed, network fault, etc.). And "the gossip protocol" we mentioned before means there is no central controller in a Redis Cluster, so if one master goes down and is replaced by its slave, the other masters will tell you which instance is the new one to access.

Besides the good things Redis Cluster offers us, we should also take a look at what a cluster cannot do, or cannot do well. The cluster model Redis chooses sacrifices consistency for availability. That is good enough for a data caching solution. But as a consequence you may soon find problems with multiple-key commands like MGET, since Redis Cluster requires that all keys manipulated in each operation be in one slot (otherwise you'll get a CROSSSLOT error).
This restriction is so strong that those operations - not only MGET and MSET, but also EVAL, SUNION, BRPOPLPUSH, etc. - are generally unavailable in a cluster. However, if you intentionally store all keys in one slot, the cluster loses its meaning. Another practice to avoid is storing large objects intensively, like overwhelmingly huge lists, hashes, and sets, which cannot be sharded. You may break hashes down into individual keys, but then you cannot do a HGETALL. You should also think about how to split lists or sets if you want to take advantage of the cluster.

Those are the things you should know about Redis Cluster if you decide to use it. We must say it's a great improvement in availability and performance, as long as you don't need those particular multi-key commands frequently. So, stay with standalone Redis or proceed to Redis Cluster - it's time to make your choice.
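To make the slot rule above concrete, here is a minimal Python sketch of the slot calculation (CRC16 of the key, mod 16384). The CRC16 variant shown is the XMODEM one described in the Redis Cluster specification; treat this as an illustration rather than a replacement for your client library's own implementation, and the example keys are arbitrary.

def crc16_xmodem(data):
    # CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0x0000
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key):
    # Every key maps to exactly one of the 16384 slots
    return crc16_xmodem(key.encode('utf-8')) % 16384

# Keys landing in different slots cannot be combined in one MGET (CROSSSLOT)
print(key_slot('user:1000'), key_slot('user:1001'))

Because each master owns a fixed range of these slots, a multi-key command is only legal when every key it touches hashes to the same slot.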

Active Learning: An approach to training machine learning models efficiently

Savia Lobo
27 Apr 2018
4 min read
Training a machine learning model to give accurate results requires crunching huge amounts of labelled data. Data is naturally unlabelled and needs 'experts' who can scan through it and tag it with correct labels. Topic-specific data labelling - for example, classifying diseases based on their type - would definitely require a doctor or someone with a medical background to label the data. Getting such topic-specific experts to label data can be difficult and quite expensive. Also, doing this for many machine learning projects is impractical. Active learning can help here.

What is Active Learning

Active learning is a type of semi-supervised machine learning which helps reduce the amount of labelled data required to train a model. In active learning, the model focuses only on data that it is confused about and requests the experts to label it. The model then trains a bit more on the small amount of newly labelled data, and repeats the process for the next batch of confusing samples. Active learning, in short, prioritizes the confusing samples that need labelling. This enables models to learn faster, allows experts to skip labelling data that is not a priority, and provides the model with the most useful information about the confusing samples. This in turn can produce great machine learning models, as active learning reduces the number of labels that need to be collected from experts.

Types of Active learning

An active learning environment includes a learner (the model being trained), a huge amount of raw, unlabelled data, and the expert (the person or system labelling the data). The role of the learner is to choose which instances or examples should be labelled. The learner's goal is to reduce the number of labelled examples needed for an ML model to learn. The expert, on receiving the data to be labelled, analyzes it to determine the appropriate labels. There are three types of active learning scenarios.

Query synthesis - the learner constructs examples, which are then sent to the expert for labelling.
Stream-based active learning - from a stream of unlabelled data, the learner decides which instances to label and which to discard.
Pool-based active learning - the most common scenario; the learner chooses only the most informative instances and forwards them to the expert for labelling.

Some Real-life applications of Active learning

Natural Language Processing (NLP): most NLP applications require a lot of labelled data, such as POS (part-of-speech) tagging, NER (named entity recognition), and so on, and there is a huge cost incurred in labelling this data. Using active learning can reduce the amount of data that needs to be labelled.

Scene understanding in self-driving cars: active learning can also be used in detecting objects, such as pedestrians, from a video camera mounted on a moving car - a key area for ensuring safety in autonomous vehicles. This can result in high levels of detection accuracy against complex and variable backgrounds.

Drug designing: drugs are biological or chemical compounds that interact with specific 'targets' in the body (usually proteins, RNA or DNA) with the aim of modifying their activity. The goal of drug design is to find which compounds bind to a particular target. The data comes from large collections of compounds, vendor catalogs, corporate collections, and combinatorial chemistry.
With active learning, the learner can find out which compounds are active (bind to the target) and which are inactive.

Active learning is still being researched with different deep learning algorithms, such as CNNs and LSTMs, acting as learners in order to improve their efficiency. GANs (Generative Adversarial Networks) are also being implemented in the active learning framework, and some research papers try to learn active learning strategies themselves using meta-learning. A small sketch of the pool-based loop appears after the related reading below.

Why is Python so good for AI and Machine Learning? 5 Python Experts Explain
AWS Greengrass brings machine learning to the edge
Unity Machine Learning Agents: Transforming Games with Artificial Intelligence
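Returning to the pool-based scenario described above, here is a minimal sketch of an uncertainty-sampling loop using scikit-learn. The synthetic dataset, the query budget, and the ask_expert() stand-in are all hypothetical; in a real system the query would go to a human labeller instead of revealing a known label.

# Minimal pool-based active learning loop with uncertainty sampling.
# The data, budget, and ask_expert() placeholder below are illustrative only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y_true = make_classification(n_samples=500, n_features=10, random_state=0)

labeled_idx = list(range(10))            # small seed set with known labels
y_labeled = list(y_true[labeled_idx])
pool_idx = list(range(10, 500))          # the unlabelled pool

def ask_expert(i):
    # Placeholder for the human expert; here we simply reveal the true label.
    return y_true[i]

model = LogisticRegression(max_iter=1000)
for _ in range(20):                      # query budget of 20 labels
    model.fit(X[labeled_idx], y_labeled)
    probs = model.predict_proba(X[pool_idx])[:, 1]
    pick = int(np.argmin(np.abs(probs - 0.5)))   # most uncertain sample
    chosen = pool_idx.pop(pick)
    labeled_idx.append(chosen)
    y_labeled.append(ask_expert(chosen))

print('Total labels requested from the expert:', len(labeled_idx))

The loop repeatedly retrains on what it has, asks for the single most confusing sample, and stops once the labelling budget is spent, which is exactly the prioritization idea described in the article.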

What software stack does Airbnb use?

Richard Gall
20 Aug 2017
4 min read
Airbnb is one of the most disruptive organizations of the last decade. Since its inception in 2008, the company has developed a platform that allows people to 'belong anywhere' (to quote its own mission statement). In doing so, the very nature of tourism has changed. But what software does Airbnb use? What tools are enabling their level of innovation?

How Airbnb develops a dynamic front end

Let's start with the key challenge for Airbnb. Like many similar platforms, one of the central difficulties is handling data in a way that's incredibly dynamic. That means you need to ensure your JavaScript is working hard for you without taking too much strain. That's where a reactive approach comes in. As an asynchronous paradigm, it's able to manage how data moves from its source to the components that react to it. But the paradigm can only do so much. By using ReactJS, Airbnb have a library capable of giving them the necessary dynamism in the UI. The Airbnb team have written a lot about their love for ReactJS, making it their canonical front end framework in 2015. But they've also built a large number of other tools around React to make life easier for their engineers. In one post, for example, the team discuss React Sketch.app, which 'allows you to write React components that render to Sketch documents.' Elsewhere, Ruby also forms an important part of the development stack. However, as with React, the team are committed to innovating with the tools at their disposal. In another post, they discuss how they built 'blazing fast thrift bindings for Ruby with C extensions.'

How Airbnb manages data

If managing data on the front end has been a crucial part of their software considerations, what about the tools that actually manage and store data? The company uses MySQL to manage core business data; this hasn't been without challenges - not least because of scalability. However, the team have found ways of making MySQL work to their advantage. Redis is also worth a mention here - read here how Airbnb use Redis to monitor customer issues at scale. But Airbnb have always been a big data company at heart - that's why Hadoop is so important to their data infrastructure. A number of years ago, Airbnb ran Hadoop on Mesos, which allows you to deploy a single configuration on different servers; this worked for a while, but owing to a number of challenges (which you can read about here) the team moved away from Mesos, running a more straightforward Hadoop infrastructure. Spark is also an important tool for Airbnb. The team actually built something called Airstream, a computational framework that sits on top of Spark Streaming and Spark SQL, allowing engineers and the data team to get quick insights. Ultimately, for an organization that depends on predictions and machine learning, something like Spark - alongside other open source machine learning libraries - is crucial in the Airbnb stack.

Cloud - how Airbnb takes advantage of AWS

If you take a close look at how they work, the Airbnb team have a true hacker mentality: it's about playing, building, and creating new tools to tackle new challenges. This has arguably been enabled by the way they use AWS. It's perhaps no coincidence that around the time Airbnb was picking up speed and establishing itself, the Amazon cloud offering was reaching maturity. Airbnb adopted a number of AWS services, such as S3 and EC2, early on. But the reason Airbnb have stuck with AWS comes down to cultural fit.
"For us, an investment in AWS is really about making sure our engineers are focused on the things that are uniquely core to our business. Everything that we do in engineering is ultimately about creating great matches between people," Kevin Rice, Director of Engineering, has said.

How Airbnb creates a DevOps culture

But there's more to it than AWS; there's a real DevOps culture inside Airbnb that further facilitates a mixture of agility and creativity. The tools used for DevOps are an interesting mix - some unsurprising, like GitHub and Nginx (which powers some of the busiest sites on the planet), but also some slightly more surprising choices, such as Kibana, which the company uses to monitor data alongside Elasticsearch. When it comes to developing and provisioning environments, Airbnb use Vagrant and Chef. It's easy to see the benefits here - they make setting up and configuring environments incredibly easy and fast. And if you're going to live by the principles of DevOps, this is essential - it's the foundation of everything you do.

Python - OOP! (Python: Object-Oriented Programming)

Liz Tom
25 Apr 2016
5 min read
Or: Currency Conversion using Python. I love to travel, and one of my favorite programming languages is Python. Sometimes when I travel I like to make things a bit more difficult, and instead of just asking Google to convert currency for me, I like to have a script on my computer that requires me to know the conversion rate and then calculate my needs. But seriously, let's use a currency converter to help explain some neat reasons why Object-Oriented Programming is awesome.

Money Money Money

First let's build a currency class.

class Currency(object):
    def __init__(self, country, value):
        self.country = country
        self.value = float(value)

OK, neato. Here's a currency class. Let's break this down. In Python every class has an __init__ method. This is how we build an instance of a class. If I call Currency(), it's going to break, because our class also happens to require two arguments. self is not required to be passed. In order to create an instance of currency we just use Currency('Canada', 1.41), and we've now got a Canadian instance of our currency class. Now let's add some helpful methods onto the Currency class.

    def from_usd(self, dollars):
        """If you provide USD it will convert to foreign currency"""
        return self.value * int(dollars)

    def to_usd(self, dollars):
        """If you provide foreign currency it will convert to USD"""
        return int(dollars) / self.value

Again, self isn't needed by us to use the methods, but it needs to be passed to every method in our class. self is our instance. In some cases self will refer to our Canadian instance, but if we were to create a new instance, Currency('Mexico', 18.45), self can now also refer to our Mexican instance. Fun. We've got some awesome methods that help me do math without me having to think. How does this help us? Well, we don't have to write new methods for each country. This way, as currency rates change, we can make the updates rather quickly and also deal with many countries at once. Conversion between USD and foreign currency is all done the same way. We don't need to change the math based on the country we're planning on visiting; we only need to change the value of the currency relative to USD. I'm an American, so I used USD because that's the currency I'd be converting to and from most often. But if I wanted, I could have named them from_home_country and to_home_country. Now how does this work? Well, if I wanted to run this script I'd just do this:

again = True
while again:
    country = raw_input('What country are you going to?\n')
    value = float(raw_input('How many of their dollars equal 1 US dollar\n'))
    foreign_country = Currency(country, value)
    convert = raw_input('What would you like to convert?\n1. To USD\n2. To %s dollars\n' % country)
    dollars = raw_input('How many dollars would you like to convert?\n')
    if( convert == '1' ):
        print dollars + ' ' + country + ' dollars are worth ' + str(foreign_country.to_usd(dollars)) + ' US dollars\n'
    elif( convert == '2' ):
        print dollars + ' US dollars are worth ' + str(foreign_country.from_usd(dollars)) + ' ' + country + ' dollars'
    again = raw_input('\n\n\nWant to go again? (Y/N)\n')
    if( again == 'y' or again == 'Y' ):
        again = True
    elif( again == 'n' or again == 'N' ):
        again = False

**I'm still using Python 2, so if you're using Python 3 you'll want to change those raw_inputs to just input. This way we can convert as much currency as we want between USD and any country!
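As a quick sanity check, here's how the class above behaves in an interactive session (the 1.41 rate is just the illustrative figure used earlier, and the rounding is only there to keep the floating-point output tidy):

>>> cad = Currency('Canada', 1.41)
>>> cad.from_usd(100)
141.0
>>> round(cad.to_usd(141), 2)
100.0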
I can now travel the world feeling comfortable that even if I can't access the Internet, as long as I have my computer nearby while I'm staring at the exchange rate board in a bank or hotel lobby, I'll be able to convert currency with ease, without having to remember which way converts my money to USD and which way converts USD to Canadian dollars.

Object-oriented programming allows us to create objects that all behave in the same way but store different values, like a blue car, a red car, or a green car. The cars all behave the same way, but they are all described differently. They might all have different MPG, but the way we calculate their MPG is the same. They all have four wheels and an engine. While it can be harder to build your program with object-oriented design in mind, it definitely helps with maintainability in the long run.

About the Author

Liz Tom is a Software Developer at Pop Art, Inc in Portland, OR. Liz's passion for full stack development and digital media makes her a natural fit at Pop Art. When she's not in the office, you can find Liz attempting parkour and going to check out interactive displays at museums.

Hands on with Kubernetes

Ryan Richard
22 Jun 2015
6 min read
In February I wrote a high-level overview of the primary Kubernetes features. In this blog post, we'll actively use all of these features to deploy a simple 2-tier application inside of a Kubernetes cluster. I highly recommend reading the intro blog before getting started.

Setup

The easiest way to deploy a cluster is to use Google Container Engine, which is available on your Google Compute Engine account. If you don't have an account, you may use one of the available Getting Started guides in the official GitHub repository. One of the great things about Kubernetes is that it functions almost identically regardless of where it's deployed, with the exception of some cloud provider integrations. I've created a small test cluster on GCE, which resulted in three instances being created. I've also added my public SSH key to the master node so that I can log in via SSH and use the kubectl command locally. kubectl is the CLI for Kubernetes, and you can also install it locally on your workstation if you prefer.

My demo application is a small Python-based app that uses redis as a backend. The source is available here. It expects Docker-style environment variables to point it at the redis server and will purposely throw a 5XX status code if there are issues reaching the database.

Walkthrough

First we're going to change the Kubernetes configuration to allow privileged containers. This is only being done for demo purposes and shouldn't be used in a production environment if you can avoid it. It is needed for the logging container we'll be deploying with the application.

SSH into the master instance and run the following commands to update the salt configuration:

sudo sed -i 's/false/true/' /srv/pillar/privilege.sls
sudo salt '*' saltutil.refresh_pillar
sudo salt-minion

Reboot your non-master nodes to force the salt changes. Once the nodes are back online, create a redis-master.yaml file on the master with the following content:

id: redis-master
kind: Pod
apiVersion: v1beta1
labels:
  name: redis-master
desiredState:
  manifest:
    version: v1beta1
    id: redis-master
    containers:
      - name: redis-master
        image: dockerfile/redis
        ports:
          - containerPort: 6379

I'm using a Pod as opposed to a replicationController since this is a stateful service, and it would not be appropriate to run multiple redis nodes in this scenario. Once ready, instruct Kubernetes to deploy the container:

kubectl create -f redis-master.yaml
kubectl get pods

Create a redis-service.yaml with the following:

kind: Service
apiVersion: v1beta1
id: redis
port: 6379
selector:
  name: redis-master
containerPort: 6379

kubectl create -f redis-service.yaml
kubectl get services

Notice that I'm hard coding the service port to match the standard redis port of 6379. Making these match isn't required, so long as the containerPort is correct. Under the hood, creating a service causes a new iptables entry to be created on each node. The entries automatically redirect traffic to a local port where kube-proxy is listening. Kube-proxy is in turn aware of where my redis-master container is running and will proxy connections for me. To prove this works, I'll connect to redis via my local address (127.0.0.1:60863), which does not have redis running, and I'll get a proper connection to my database, which is on another machine.
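One quick way to check the proxying yourself is a short Python snippet. This is a hedged sketch, not part of the original post; it assumes the redis-py package is installed on the node and that kube-proxy is exposing the service on local port 60863 as above:

import redis

# Connect to the local kube-proxy port; kube-proxy forwards the
# connection to wherever the redis-master pod is actually running.
r = redis.StrictRedis(host='127.0.0.1', port=60863)
r.set('hello', 'kubernetes')
print(r.get('hello'))  # returns 'kubernetes' if the service is wired up

Seeing as that works, let's get back to the point at hand and deploy our application.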
Write a demoapp.yaml file with the following content:

id: frontend-controller
apiVersion: v1beta1
kind: ReplicationController
labels:
  name: frontend-controller
desiredState:
  replicas: 2
  replicaSelector:
    name: demoapp
  podTemplate:
    labels:
      name: demoapp
    desiredState:
      manifest:
        id: demoapp
        version: v1beta3
        containers:
          - name: frontend
            image: doublerr/redis-demo
            ports:
              - containerPort: 8888
                hostPort: 80
          - name: logentries
            privileged: true
            command:
              - "--no-stats"
              - "-l"
              - "<log token>"
              - "-j"
              - "-t"
              - "<account token>"
              - "-a app=demoapp"
            image: logentries/docker-logentries
            volumeMounts:
              - mountPath: /var/run/docker.sock
                name: dockersock
                readOnly: true
        volumes:
          - name: dockersock
            source:
              hostDir:
                path: /var/run/docker.sock

In the above description, I'm grouping two containers, one based on my redis-demo image and one based on the logentries image. I wanted to show the idea of sidecar containers, which are containers deployed alongside the primary container and whose job is to support the primary container. In this case, the sidecar forwards logs to my logentries.com account, tagged with the name of my app. If you're following along, you can sign up for a free logentries account to test this out. You'll need to create a new log and retrieve the log token and account token first. You can then replace the <log token> and <account token> in the yaml file with your values.

Deploy the application:

kubectl create -f demoapp.yaml
kubectl get pods

If your cloud provider is blocking port 80 traffic, make sure to allow it directly to your nodes, and you should be able to see the app running in a browser once the pod status is "Running".

Co-locating Containers

Co-locating containers is a powerful concept worth spending some time on. Since Kubernetes guarantees that co-located containers run together, my primary container doesn't need to be aware of anything beyond running the application. In this case, logging is dealt with separately. If I want to switch logging services, I just redeploy the app with a new sidecar container that sends the logs elsewhere. Imagine doing this for monitoring, application content updates, and so on. You can really see the power of co-locating containers.

On a side note, the logentries image isn't perfectly suited to this methodology. It's designed so that you run one of these containers per Docker host, and it forwards all container logs upstream. It also requires access to the Docker socket on the host. A better design for the Kubernetes paradigm would be a container that only collects STDOUT and STDERR for the container it's attached to. The logentries image works for this proof of concept, though, and I can see errors in my account.

In closing, Kubernetes is fun to deploy applications into, especially if you start thinking about how best to leverage grouped containers. Most stateless applications will want to use a ReplicationController instead of a single pod, and services help tie everything together.

For more Docker tutorials, insight and analysis, visit our dedicated Docker page.

About the Author

Ryan Richard is a systems architect at Rackspace with a background in automation and OpenStack. His primary role revolves around research and development of new technologies. He added the initial support for the Rackspace Cloud into the Kubernetes codebase. He can be reached at @rackninja on Twitter.
Open Source Software: Are maintainers the only ones responsible for software sustainability?

Savia Lobo
01 Dec 2018
6 min read
Last week, a Californian computer scientist disclosed a malicious package, 'flatmap-stream', inside the popular npm package 'event-stream'. The breach happened because ownership of the event-stream package had been transferred by Dominic Tarr (the original author) to a malicious user, right9ctrl. Following this, many Twitter and GitHub users have supported Tarr, while others think he should have been more careful when transferring package ownership.

Andre Staltz, an open source hacker, said in support of Dominic: "The fact that he gave ownership meant that he *cared* at least to do a tiny action that seemed ok. Not caring would be doing absolutely nothing at all, and that's the case quite often, and OSS maintainers get criticized also for *that*"

Who's responsible for maintaining open source software?

At the NDC Sydney 2018 conference held in September, two open source maintainers, Nick Randolph, Technical Lead at Built To Roam, and Geoffrey Huntley, an open source software engineer, talked about why companies and people should contribute back to open source and how they can do it. However, if something goes wrong with a project, who is responsible for it? Most users blame the maintainers of the project, but the license does not say so. In fact, users, contributors, and maintainers are all equally responsible.

Open source is a fantastic avenue for personal development, as it does not require the supply, material, planning, and approval that other software does.

Some reasons to contribute to open source software:

- Other people will help you for free
- You will save a lot on training and documentation
- You will not be criticized by open source advocates
- You gain the ability to hire the best engineers
- You will be able to influence the direction of the projects to which you contribute

Companies have embraced open source software as it allows them to get solutions to market faster for their customers. It has allowed companies to focus on delivering business value instead of low-level technical tasks.

The problem with open source

The majority of the open source software that the world depends on is built by volunteers. When a business chooses to use open source software, this volunteer labor is essentially an unpaid vendor with no contractual obligations. However, the speakers say, "Historically, we have defined open-source software in terms of freedom for the consumer. In the future, now that open-source has 'won', this dialogue needs to change. Did we get it right? Did we ever stop to think about how software is maintained, the rights of maintainers and the cost of maintenance?"

The maintainers said that, as per the open source software license, once the software is released to the world, their responsibility ends. They need not respond to GitHub issues, create documentation, answer questions on Stack Overflow, and so on. A well-known example of the damage this can cause is the Heartbleed bug, a security flaw found in the OpenSSL cryptographic software library, which caused a huge loss of revenue across the industry.

However, when open source software breaks or users need new features, they log an issue on GitHub and then sit back awaiting a response. If the comments are not addressed by the maintainer, users start complaining about how badly the project is run. The thing about open source software that's too often forgotten: it is provided as-is, no exceptions.

How should businesses secure their supply chain?
Different projects may operate differently, with more or fewer people, with work prioritized differently, and on differing release schedules, but in all cases the software is delivered as-is, meaning there is absolutely no SLA.

The speakers say that businesses should analyze the level of contribution they need to make towards the open source community. They highlight that, in order to secure their supply chain, users should contribute either money or time. The truth is that free software is not really free. How much is this going to cost in man hours?

If not with money, you can contribute with time. For instance, there is an initiative called opensourcefriday.com: as an engineering leader, you or your employees can submit pull requests and learn how the open source you depend upon works. This means you have a positive influence in the community while also contributing back to open source. And if your company faces a critical issue, the maintainer is more likely to help you because you have actively contributed to the community.

Source: YouTube

How do you know how much to contribute?

In order to shift the goal of the software, you have to be a maintainer or a core contributor to influence the direction. If you just want to protect the supply chain, you can simply fix what's broken. If you wish to contribute at a consistent velocity, contribute at a rate that you can maintain for as long as you want.

Source: YouTube

According to Nick and Geoffrey, what users and businesses should do is:

- Protect their software supply chain: from a business perspective, know which components you are making use of and make sure that these components will continue to exist going forward.
- Think about the sustainability of the project, so that it does not wither away. If the project is good for the community, make it sustainable by getting more and more people to join it.
- Keep track of what they are contributing back to these projects.
- Share their experiences and best practices. This helps the wider community analyze risk factors and lets the industry mature beyond simple security concerns.

Watch the complete talk by Nick and Geoffrey on YouTube: https://www.youtube.com/watch?v=Mm_RuObpeGo&app=desktop

The Linux and RISC-V foundations team up to drive open source development and adoption of RISC-V instruction set architecture (ISA)

OpenStack Foundation to tackle open source infrastructure problems, will conduct conferences under the name 'Open Infrastructure Summit'

The Ceph Foundation has been launched by the Linux Foundation to support the open source storage project

Black Friday Special: 17 ways in 2017 that online retailers use machine learning

Sugandha Lahoti
24 Nov 2017
10 min read
Black Friday sales are just around the corner. Both online and traditional retailers have geared up to race past each other in the ultimate shopping frenzy of the year. Although both brick-and-mortar retailers and online platforms will generate high sales, online retailers will sweep past the offline platforms. Why? For online retailers, the best part remains the fact that when shopping online, customers don't have to deal with pushy crowds, traffic, salespeople, and long queues. Online shoppers have access to a much larger array of products, and they can switch between stores by just switching between tabs on their smart devices.

Considering the surge of shoppers expected during such peak seasons, big data analytics is a helpful tool for online retailers. With the advances in machine learning, big data analytics is no longer confined to the technology landscape; it also represents a way for retailers to connect with consumers in a purposeful way. For retailers both big and small, adopting the right ML-powered big data analytics strategy helps increase sales, retain customers, and generate high revenues. Here are 17 reasons why data is an important asset for retailers, especially on the 24th of this month.

A. Improving site infrastructure

The first thing that a customer sees when landing on an e-commerce website is the UI, ease of access, product classification, number of filters, and so on. Hence, building an easy-to-use website is paramount. Here's how ML-powered big data analytics can help:

1. E-commerce site analysis

A complete site analysis is one of the ways to increase sales and retain customers. By analyzing page views, actual purchases, bounce rates, and the least popular products, the e-commerce website can be altered for better usability. Data mining techniques can also be used to enhance website features. These include web mining, which is used to extract information from the web, and log files, which contain information about the user. For time-bound sales like Black Friday and Cyber Monday, this is quite helpful for better product placement, removing unnecessary products, and showcasing products which cater to a particular user base (see the small page-analysis sketch at the end of this section).

2. Generating test data

Generating test data enables deeper analysis, which helps increase sales. Big data analytics can give a helping hand here by organizing products based on type, shopper gender and age group, brand, pricing, number of views of each product page, and the information provided for that product. During peak seasons such as Black Friday, ML-powered data analytics can analyze the most visited pages and shopper traffic flow for better product placement and personalized recommendations.
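As a rough illustration of the kind of site analysis described above, here is a minimal sketch, not from the original article, that computes per-page views and conversion rates from a hypothetical page-view log; the column names are assumptions, and it presumes pandas is available:

import pandas as pd

# Hypothetical clickstream export: one row per page view.
page_views = pd.DataFrame({
    'page': ['/tv-deals', '/tv-deals', '/toys', '/toys', '/toys', '/laptops'],
    'session': ['s1', 's2', 's2', 's3', 's4', 's5'],
    'purchased': [True, False, False, True, False, False],
})

# Views and purchases per page, plus a simple conversion rate.
summary = page_views.groupby('page').agg(
    views=('session', 'count'),
    purchases=('purchased', 'sum'),
)
summary['conversion_rate'] = summary['purchases'] / summary['views']
print(summary.sort_values('conversion_rate', ascending=False))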
B. Enhancing Products and Categories

Every retailer in the world is looking for ways to reduce costs without sacrificing the quality of their products. Big data analytics in combination with machine learning is of great help here.

3. Category development

Big data analytics can help in building new product categories, or in eliminating or enhancing old ones. This is possible by using machine learning techniques to analyze patterns in marketing data as well as external factors such as product niches. ML-powered assortment planning can help in selecting and planning products for a specified period of time, such as the Thanksgiving week, so as to maximize sales and profit. Data analytics can also help in defining category roles, in order to clearly define the purpose of each category in the total business lifecycle. This is done to ensure that efforts made around a particular category actually contribute to category development. It also helps to identify key categories, which are the featured products that specifically meet an objective, for example healthy food items or cheap electronics.

4. Range selection

An optimum and dynamic product range is essential to retain customers. Big data analytics can use sales data and shopper history to measure a product range for maximum profitability. This is especially important for Black Friday and Cyber Monday deals, where products are sold at heavily discounted rates.

5. Inventory management

Data analytics can give an overview of best-selling products, non-performing or slow-moving products, seasonal products, and so on. These data points can help retailers manage their inventory and reduce the associated costs. Machine learning powered big data analytics is also helpful in shaping product localization strategies, i.e. which product sells well in which areas. In order to localize for China, Amazon changed its China branding to Amazon.cn. To make it easy for Chinese customers to pay, Amazon China introduced portable POS so users can pay the delivery person via credit card at their doorstep.

6. Waste reduction

Big data analytics can analyze sales and reviews to identify products which don't do well, and either eliminate the product or combine it with a well-performing companion product to increase its sales. Analyzing data can also help in listing products that were returned due to damages and defects. Generating insights from this data using machine learning models can help retailers in many ways; for example, they can modify their stocking methods and improve their packaging and logistics support for those kinds of products.

7. Supply chain optimization

Big data analytics also has a role to play in supply chain optimization. This includes using sales and forecast data to plan and manage the movement of goods from retailers to warehouses, onto transport, and to the doorstep of customers. Top retailers like Amazon are offering deals under the Black Friday banner for the entire week. Expanding the sale window is a great supply chain optimization technique for more manageable selling.

C. Upgrading the Customer experience

Customers are the most important assets for any retailer. Big data analytics is here to help you retain, acquire, and attract your customers.

8. Shopper segmentation

Machine learning techniques can link and analyze granular data, such as behavioral, transactional, and interaction data, to identify and classify customers who behave in similar ways (see the clustering sketch below). This eliminates the guesswork involved and helps in creating rich and highly dynamic consumer profiles. According to a report by Research Methodology, Walmart uses a mono-segment type of positioning targeted at a single customer segment. Walmart also pays attention to young consumers due to the strategic importance of achieving the loyalty of young consumers for long-term perspectives.
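A minimal sketch of the kind of shopper segmentation described above, not from the article: it assumes scikit-learn is available and uses made-up per-shopper features.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical features per shopper:
# [orders_per_year, avg_basket_value, days_since_last_visit]
shoppers = np.array([
    [24, 35.0, 3],
    [2, 250.0, 90],
    [12, 60.0, 14],
    [1, 40.0, 200],
    [30, 20.0, 1],
    [3, 180.0, 60],
])

# Scale features so no single one dominates, then cluster into 3 segments.
X = StandardScaler().fit_transform(shoppers)
segments = KMeans(n_clusters=3, random_state=0, n_init=10).fit_predict(X)
print(segments)  # one segment label per shopper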
9. Promotional analytics

An important factor for better sales is analyzing how customers respond to promotions and discounts. Analyzing data on an hour-to-hour basis on special days such as Black Friday or Cyber Monday, which see high customer traffic, can help retailers plan better promotions and improve brand penetration. The Boston Consulting Group uses data analytics to accurately gauge the performance of promotions and predict promotion performance in advance.

10. Product affinity models

By analyzing a shopper's past transaction history, product affinity models can identify the customers with the highest propensity to buy a particular product. Retailers can then use this to attract more customers or provide existing ones with better personalization. Product affinity models can also cluster products that are mostly bought together, which can be used to improve recommendation systems.

11. Customer churn prediction

The massive quantity of customer data being collected can be used to predict customer churn. Churn prediction is helpful in retaining customers, attracting new ones, and also acquiring the right type of customers in the first place. Classification models such as logistic regression can be used to predict the customers most likely to churn (a brief sketch follows below). As part of its Azure Machine Learning offering, Microsoft has a Retail Customer Churn Prediction Template to help retail companies predict customer churn.
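As a rough illustration, not from the article, a logistic regression churn model might be sketched like this, assuming scikit-learn and made-up feature columns:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical features per customer: [months_active, orders_last_90d, support_tickets]
X = np.array([
    [24, 6, 0], [3, 0, 2], [18, 4, 1], [2, 1, 3],
    [36, 10, 0], [5, 0, 4], [12, 3, 1], [1, 0, 5],
])
y = np.array([0, 1, 0, 1, 0, 1, 0, 1])  # 1 = churned, 0 = retained

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
model = LogisticRegression().fit(X_train, y_train)

# Probability of churn for each held-out customer.
print(model.predict_proba(X_test)[:, 1])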
D. Formulating and aligning business strategies

Every retailer needs tools and strategies for a product or a service to reach and influence consumers, generate profits, and contribute to the long-term success of the business. Below are some ways ML-powered big data analytics can help retailers do just that.

12. Building dynamic pricing models

Pricing models can be designed by looking at a customer's purchasing habits and surfing history. This descriptive analytics can be fed into a predictive model to obtain an optimal pricing model, with outputs such as price sensitivity scores and price-to-demand elasticity. For example, Amazon uses a dynamic price optimization technique, offering its biggest discounts on its most popular products while making profits on less popular ones. IBM's Predictive Customer Intelligence can dynamically adjust the price of a product based on a customer's purchase decision.

13. Time series analysis

Time series analysis can be used to identify patterns and trends in customer purchases, or in a product's lifecycle, by observing information in a sequential fashion. It can also be used to predict future values based on the sequence so generated. For online retailers, this means using historical sales data to forecast future sales, and analyzing time-dependent patterns to list new arrivals or mark prices up or down depending on events such as Black Friday or Cyber Monday sales.

14. Demand forecasting

Machine learning powered big data analytics can learn demand levels from a wide array of factors, such as product nature, characteristics, seasonality, relationships with other associated products, and relationships with other market factors. It can then forecast the demand for a particular product using a simulation model. Such predictive analytics are highly accurate and also reduce costs, especially for events like Black Friday, where there is a high surge of shoppers.

15. Strategy adjustment

Predictive big data analytics can help shorten the go-to-market time for product launches, allowing marketers to adjust their strategy midcourse if needed. For Black Friday or Cyber Monday deals, an online retailer can predict the demand for a particular product and amend strategies along the way, such as increasing the discount or keeping a product at the discounted rate for a longer time.

16. Reporting and sales analysis

Big data analytics tools can analyze large quantities of retail data quickly. Most such tools also have a simple dashboard UI, which gives retailers detailed answers to their queries in a single click. This saves a lot of the time previously spent creating reports or sales summaries, and the reports generated are quick to produce and easy to understand.

17. Marketing mix spend optimization

Forecasting sales and proving the ROI of marketing activities are two pain points faced by most retailers. Marketing mix modelling is a big data statistical analysis which uses historical data to show the impact of marketing activities on sales, and then forecasts the impact of future marketing tactics. Insights derived from such tools can be used to enhance marketing strategies and optimize costs.

By adopting the strategies mentioned above, retailers can maximize their gains this holiday season, starting with Black Friday, which begins as the clock chimes 12 today. ML-powered big data analytics is there to help retailers attract new shoppers, retain them, enhance product lines, define new categories, and formulate and align business strategies. Gear up for a Big Data Black Friday this 2017!