Design Patterns | 0 articles | Tech News, Tutorials & Expert Insights

Dealing with Legacy Code

Packt

31 Mar 2015

16 min read

In this article by Arun Ravindran, author of the book Django Best Practices and Design Patterns, we will discuss the following topics: Reading a Django code base Discovering relevant documentation Incremental changes versus full rewrites Writing tests before changing code Legacy database integration (For more resources related to this topic, see here.) It sounds exciting when you are asked to join a project. Powerful new tools and cutting-edge technologies might await you. However, quite often, you are asked to work with an existing, possibly ancient, codebase. To be fair, Django has not been around for that long. However, projects written for older versions of Django are sufficiently different to cause concern. Sometimes, having the entire source code and documentation might not be enough. If you are asked to recreate the environment, then you might need to fumble with the OS configuration, database settings, and running services locally or on the network. There are so many pieces to this puzzle that you might wonder how and where to start. Understanding the Django version used in the code is a key piece of information. As Django evolved, everything from the default project structure to the recommended best practices have changed. Therefore, identifying which version of Django was used is a vital piece in understanding it. Change of Guards Sitting patiently on the ridiculously short beanbags in the training room, the SuperBook team waited for Hart. He had convened an emergency go-live meeting. Nobody understood the "emergency" part since go live was at least 3 months away. Madam O rushed in holding a large designer coffee mug in one hand and a bunch of printouts of what looked like project timelines in the other. Without looking up she said, "We are late so I will get straight to the point. In the light of last week's attacks, the board has decided to summarily expedite the SuperBook project and has set the deadline to end of next month. Any questions?" "Yeah," said Brad, "Where is Hart?" Madam O hesitated and replied, "Well, he resigned. Being the head of IT security, he took moral responsibility of the perimeter breach." Steve, evidently shocked, was shaking his head. "I am sorry," she continued, "But I have been assigned to head SuperBook and ensure that we have no roadblocks to meet the new deadline." There was a collective groan. Undeterred, Madam O took one of the sheets and began, "It says here that the Remote Archive module is the most high-priority item in the incomplete status. I believe Evan is working on this." "That's correct," said Evan from the far end of the room. "Nearly there," he smiled at others, as they shifted focus to him. Madam O peered above the rim of her glasses and smiled almost too politely. "Considering that we already have an extremely well-tested and working Archiver in our Sentinel code base, I would recommend that you leverage that instead of creating another redundant system." "But," Steve interrupted, "it is hardly redundant. We can improve over a legacy archiver, can't we?" "If it isn't broken, then don't fix it", replied Madam O tersely. He said, "He is working on it," said Brad almost shouting, "What about all that work he has already finished?" "Evan, how much of the work have you completed so far?" asked O, rather impatiently. "About 12 percent," he replied looking defensive. Everyone looked at him incredulously. "What? That was the hardest 12 percent" he added. O continued the rest of the meeting in the same pattern. Everybody's work was reprioritized and shoe-horned to fit the new deadline. As she picked up her papers, readying to leave she paused and removed her glasses. "I know what all of you are thinking... literally. But you need to know that we had no choice about the deadline. All I can tell you now is that the world is counting on you to meet that date, somehow or other." Putting her glasses back on, she left the room. "I am definitely going to bring my tinfoil hat," said Evan loudly to himself. Finding the Django version Ideally, every project will have a requirements.txt or setup.py file at the root directory, and it will have the exact version of Django used for that project. Let's look for a line similar to this: Django==1.5.9 Note that the version number is exactly mentioned (rather than Django>=1.5.9), which is called pinning. Pinning every package is considered a good practice since it reduces surprises and makes your build more deterministic. Unfortunately, there are real-world codebases where the requirements.txt file was not updated or even completely missing. In such cases, you will need to probe for various tell-tale signs to find out the exact version. Activating the virtual environment In most cases, a Django project would be deployed within a virtual environment. Once you locate the virtual environment for the project, you can activate it by jumping to that directory and running the activated script for your OS. For Linux, the command is as follows: $ source venv_path/bin/activate Once the virtual environment is active, start a Python shell and query the Django version as follows: $ python >>> import django >>> print(django.get_version()) 1.5.9 The Django version used in this case is Version 1.5.9. Alternatively, you can run the manage.py script in the project to get a similar output: $ python manage.py --version 1.5.9 However, this option would not be available if the legacy project source snapshot was sent to you in an undeployed form. If the virtual environment (and packages) was also included, then you can easily locate the version number (in the form of a tuple) in the __init__.py file of the Django directory. For example: $ cd envs/foo_env/lib/python2.7/site-packages/django $ cat __init__.py VERSION = (1, 5, 9, 'final', 0) ... If all these methods fail, then you will need to go through the release notes of the past Django versions to determine the identifiable changes (for example, the AUTH_PROFILE_MODULE setting was deprecated since Version 1.5) and match them to your legacy code. Once you pinpoint the correct Django version, then you can move on to analyzing the code. Where are the files? This is not PHP One of the most difficult ideas to get used to, especially if you are from the PHP or ASP.NET world, is that the source files are not located in your web server's document root directory, which is usually named wwwroot or public_html. Additionally, there is no direct relationship between the code's directory structure and the website's URL structure. In fact, you will find that your Django website's source code is stored in an obscure path such as /opt/webapps/my-django-app. Why is this? Among many good reasons, it is often more secure to move your confidential data outside your public webroot. This way, a web crawler would not be able to accidentally stumble into your source code directory. Starting with urls.py Even if you have access to the entire source code of a Django site, figuring out how it works across various apps can be daunting. It is often best to start from the root urls.py URLconf file since it is literally a map that ties every request to the respective views. With normal Python programs, I often start reading from the start of its execution—say, from the top-level main module or wherever the __main__ check idiom starts. In the case of Django applications, I usually start with urls.py since it is easier to follow the flow of execution based on various URL patterns a site has. In Linux, you can use the following find command to locate the settings.py file and the corresponding line specifying the root urls.py: $ find . -iname settings.py -exec grep -H 'ROOT_URLCONF' {} ; ./projectname/settings.py:ROOT_URLCONF = 'projectname.urls' $ ls projectname/urls.py projectname/urls.py Jumping around the code Reading code sometimes feels like browsing the web without the hyperlinks. When you encounter a function or variable defined elsewhere, then you will need to jump to the file that contains that definition. Some IDEs can do this automatically for you as long as you tell it which files to track as part of the project. If you use Emacs or Vim instead, then you can create a TAGS file to quickly navigate between files. Go to the project root and run a tool called Exuberant Ctags as follows: find . -iname "*.py" -print | etags - This creates a file called TAGS that contains the location information, where every syntactic unit such as classes and functions are defined. In Emacs, you can find the definition of the tag, where your cursor (or point as it called in Emacs) is at using the M-. command. While using a tag file is extremely fast for large code bases, it is quite basic and is not aware of a virtual environment (where most definitions might be located). An excellent alternative is to use the elpy package in Emacs. It can be configured to detect a virtual environment. Jumping to a definition of a syntactic element is using the same M-. command. However, the search is not restricted to the tag file. So, you can even jump to a class definition within the Django source code seamlessly. Understanding the code base It is quite rare to find legacy code with good documentation. Even if you do, the documentation might be out of sync with the code in subtle ways that can lead to further issues. Often, the best guide to understand the application's functionality is the executable test cases and the code itself. The official Django documentation has been organized by versions at https://docs.djangoproject.com. On any page, you can quickly switch to the corresponding page in the previous versions of Django with a selector on the bottom right-hand section of the page: In the same way, documentation for any Django package hosted on readthedocs.org can also be traced back to its previous versions. For example, you can select the documentation of django-braces all the way back to v1.0.0 by clicking on the selector on the bottom left-hand section of the page: Creating the big picture Most people find it easier to understand an application if you show them a high-level diagram. While this is ideally created by someone who understands the workings of the application, there are tools that can create very helpful high-level depiction of a Django application. A graphical overview of all models in your apps can be generated by the graph_models management command, which is provided by the django-command-extensions package. As shown in the following diagram, the model classes and their relationships can be understood at a glance: Model classes used in the SuperBook project connected by arrows indicating their relationships This visualization is actually created using PyGraphviz. This can get really large for projects of even medium complexity. Hence, it might be easier if the applications are logically grouped and visualized separately. PyGraphviz Installation and Usage If you find the installation of PyGraphviz challenging, then don't worry, you are not alone. Recently, I faced numerous issues while installing on Ubuntu, starting from Python 3 incompatibility to incomplete documentation. To save your time, I have listed the steps that worked for me to reach a working setup. On Ubuntu, you will need the following packages installed to install PyGraphviz: $ sudo apt-get install python3.4-dev graphviz libgraphviz-dev pkg-config Now activate your virtual environment and run pip to install the development version of PyGraphviz directly from GitHub, which supports Python 3: $ pip install git+http://github.com/pygraphviz/pygraphviz.git#egg=pygraphviz Next, install django-extensions and add it to your INSTALLED_APPS. Now, you are all set. Here is a sample usage to create a GraphViz dot file for just two apps and to convert it to a PNG image for viewing: $ python manage.py graph_models app1 app2 > models.dot $ dot -Tpng models.dot -o models.png Incremental change or a full rewrite? Often, you would be handed over legacy code by the application owners in the earnest hope that most of it can be used right away or after a couple of minor tweaks. However, reading and understanding a huge and often outdated code base is not an easy job. Unsurprisingly, most programmers prefer to work on greenfield development. In the best case, the legacy code ought to be easily testable, well documented, and flexible to work in modern environments so that you can start making incremental changes in no time. In the worst case, you might recommend discarding the existing code and go for a full rewrite. Or, as it is commonly decided, the short-term approach would be to keep making incremental changes, and a parallel long-term effort might be underway for a complete reimplementation. A general rule of thumb to follow while taking such decisions is—if the cost of rewriting the application and maintaining the application is lower than the cost of maintaining the old application over time, then it is recommended to go for a rewrite. Care must be taken to account for all the factors, such as time taken to get new programmers up to speed, the cost of maintaining outdated hardware, and so on. Sometimes, the complexity of the application domain becomes a huge barrier against a rewrite, since a lot of knowledge learnt in the process of building the older code gets lost. Often, this dependency on the legacy code is a sign of poor design in the application like failing to externalize the business rules from the application logic. The worst form of a rewrite you can probably undertake is a conversion, or a mechanical translation from one language to another without taking any advantage of the existing best practices. In other words, you lost the opportunity to modernize the code base by removing years of cruft. Code should be seen as a liability not an asset. As counter-intuitive as it might sound, if you can achieve your business goals with a lesser amount of code, you have dramatically increased your productivity. Having less code to test, debug, and maintain can not only reduce ongoing costs but also make your organization more agile and flexible to change. Code is a liability not an asset. Less code is more maintainable. Irrespective of whether you are adding features or trimming your code, you must not touch your working legacy code without tests in place. Write tests before making any changes In the book Working Effectively with Legacy Code, Michael Feathers defines legacy code as, simply, code without tests. He elaborates that with tests one can easily modify the behavior of the code quickly and verifiably. In the absence of tests, it is impossible to gauge if the change made the code better or worse. Often, we do not know enough about legacy code to confidently write a test. Michael recommends writing tests that preserve and document the existing behavior, which are called characterization tests. Unlike the usual approach of writing tests, while writing a characterization test, you will first write a failing test with a dummy output, say X, because you don't know what to expect. When the test harness fails with an error, such as "Expected output X but got Y", then you will change your test to expect Y. So, now the test will pass, and it becomes a record of the code's existing behavior. Note that we might record buggy behavior as well. After all, this is unfamiliar code. Nevertheless, writing such tests are necessary before we start changing the code. Later, when we know the specifications and code better, we can fix these bugs and update our tests (not necessarily in that order). Step-by-step process to writing tests Writing tests before changing the code is similar to erecting scaffoldings before the restoration of an old building. It provides a structural framework that helps you confidently undertake repairs. You might want to approach this process in a stepwise manner as follows: Identify the area you need to make changes to. Write characterization tests focusing on this area until you have satisfactorily captured its behavior. Look at the changes you need to make and write specific test cases for those. Prefer smaller unit tests to larger and slower integration tests. Introduce incremental changes and test in lockstep. If tests break, then try to analyze whether it was expected. Don't be afraid to break even the characterization tests if that behavior is something that was intended to change. If you have a good set of tests around your code, then you can quickly find the effect of changing your code. On the other hand, if you decide to rewrite by discarding your code but not your data, then Django can help you considerably. Legacy databases There is an entire section on legacy databases in Django documentation and rightly so, as you will run into them many times. Data is more important than code, and databases are the repositories of data in most enterprises. You can modernize a legacy application written in other languages or frameworks by importing their database structure into Django. As an immediate advantage, you can use the Django admin interface to view and change your legacy data. Django makes this easy with the inspectdb management command, which looks as follows: $ python manage.py inspectdb > models.py This command, if run while your settings are configured to use the legacy database, can automatically generate the Python code that would go into your models file. Here are some best practices if you are using this approach to integrate to a legacy database: Know the limitations of Django ORM beforehand. Currently, multicolumn (composite) primary keys and NoSQL databases are not supported. Don't forget to manually clean up the generated models, for example, remove the redundant 'ID' fields since Django creates them automatically. Foreign Key relationships may have to be manually defined. In some databases, the auto-generated models will have them as integer fields (suffixed with _id). Organize your models into separate apps. Later, it will be easier to add the views, forms, and tests in the appropriate folders. Remember that running the migrations will create Django's administrative tables (django_* and auth_*) in the legacy database. In an ideal world, your auto-generated models would immediately start working, but in practice, it takes a lot of trial and error. Sometimes, the data type that Django inferred might not match your expectations. In other cases, you might want to add additional meta information such as unique_together to your model. Eventually, you should be able to see all the data that was locked inside that aging PHP application in your familiar Django admin interface. I am sure this will bring a smile to your face. Summary In this article, we looked at various techniques to understand legacy code. Reading code is often an underrated skill. But rather than reinventing the wheel, we need to judiciously reuse good working code whenever possible. Resources for Article: Further resources on this subject: So, what is Django? [article] Adding a developer with Django forms [article] Introduction to Custom Template Filters and Tags [article]

0
0
7306

Getting Started with Meteor

Packt

14 Sep 2015

6 min read

In this article, based on Marcelo Reyna's book Meteor Design Patterns, we will see that when you want to develop an application of any kind, you want to develop it fast. Why? Because the faster you develop, the better your return on investment will be (your investment is time, and the real cost is the money you could have produced with that time). There are two key ingredients ofrapid web development: compilers and patterns. Compilers will help you so that youdon’t have to type much, while patterns will increase the paceat which you solve common programming issues. Here, we will quick-start compilers and explain how they relate withMeteor, a vast but simple topic. The compiler we will be looking at isCoffeeScript. (For more resources related to this topic, see here.) CoffeeScriptfor Meteor CoffeeScript effectively replaces JavaScript. It is much faster to develop in CoffeeScript, because it simplifies the way you write functions, objects, arrays, logical statements, binding, and much more.All CoffeeScript files are saved with a .coffee extension. We will cover functions, objects, logical statements, and binding, since thisis what we will use the most. Objects and arrays CoffeeScriptgets rid of curly braces ({}), semicolons (;), and commas (,). This alone saves your fingers from repeating unnecessary strokes on the keyboard. CoffeeScript instead emphasizes on the proper use of tabbing. Tabbing will not only make your code more readable (you are probably doing it already), but also be a key factor inmaking it work. Let’s look at some examples: #COFFEESCRIPT toolbox = hammer:true flashlight:false Here, we are creating an object named toolbox that contains two keys: hammer and flashlight. The equivalent in JavaScript would be this: //JAVASCRIPT - OUTPUT var toolbox = { hammer:true, flashlight:false }; Much easier! As you can see, we have to tab to express that both the hammer and the flashlight properties are a part of toolbox. The word var is not allowed in CoffeeScript because CoffeeScript automatically applies it for you. Let’stakea look at how we would createan array: #COFFEESCRIPT drill_bits = [ “1/16 in” “5/64 in” “3/32 in” “7/64 in” ] //JAVASCRIPT – OUTPUT vardrill_bits; drill_bits = [“1/16 in”,”5/64 in”,”3/32 in”,”7/64 in”]; Here, we can see we don’t need any commas, but we do need brackets to determine that this is an array. Logical statements and operators CoffeeScript also removes a lot ofparentheses (()) in logical statements and functions. This makes the logic of the code much easier to understand at the first glance. Let’s look at an example: #COFFEESCRIPT rating = “excellent” if five_star_rating //JAVASCRIPT – OUTPUT var rating; if(five_star_rating){ rating = “excellent”; } In this example, we can clearly see thatCoffeeScript is easier to read and write.Iteffectively replaces all impliedparentheses in any logical statement. Operators such as &&, ||, and !== are replaced by words. Here is a list of the operators that you will be using the most: CoffeeScript JavaScript is === isnt !== not ! and && or || true, yes, on true false, no, off false @, this this Let's look at a slightly more complex logical statement and see how it compiles: #COFFEESCRIPT # Suppose that “this” is an object that represents a person and their physical properties if@eye_coloris “green” retina_scan = “passed” else retina_scan = “failed” //JAVASCRIPT - OUTPUT if(this.eye_color === “green”){ retina_scan = “passed”; } else { retina_scan = “failed”; } When using @eye_color to substitute for this.eye_color, notice that we do not need . Functions JavaScript has a couple of ways of creating functions. They look like this: //JAVASCRIPT //Save an anonymous function onto a variable varhello_world = function(){ console.log(“Hello World!”); } //Declare a function functionhello_world(){ console.log(“Hello World!”); } CoffeeScript uses ->instead of the function()keyword.The following example outputs a hello_world function: #COFFEESCRIPT #Create a function hello_world = -> console.log “Hello World!” //JAVASCRIPT - OUTPUT varhello_world; hello_world = function(){ returnconsole.log(“Hello World!”); } Once again, we use a tab to express the content of the function, so there is no need ofcurly braces ({}). This means that you have to make sure you have all of the logic of the function tabbed under its namespace. But what about our parameters? We can use (p1,p2) -> instead, where p1 and p2 are parameters. Let’s make our hello_world function say our name: #COFFEESCRIPT hello_world = (name) -> console.log “Hello #{name}” //JAVSCRIPT – OUTPUT varhello_world; hello_world = function(name) { returnconsole.log(“Hello “ + name); } In this example, we see how the special word function disappears and string interpolation. CoffeeScript allows the programmer to easily add logic to a string by escaping the string with #{}. Unlike JavaScript, you can also add returns and reshape the way astring looks without breaking the code. Binding In Meteor, we will often find ourselves using the properties of bindingwithin nested functions and callbacks.Function binding is very useful for these types of cases and helps avoid having to save data in additional variables. Let’s look at an example: #COFFEESCRIPT # Let’s make the context of this equal to our toolbox object # this = # hammer:true # flashlight:false # Run a method with a callback Meteor.call “use_hammer”, -> console.log this In this case, the thisobjectwill return a top-level object, such as the browser window. That's not useful at all. Let’s bind it now: #COFFEESCRIPT # Let’s make the context of this equal to our toolbox object # this = # hammer:true # flashlight:false # Run a method with a callback Meteor.call “use_hammer”, => console.log this The key difference is the use of =>instead of the expected ->sign fordefining the function. This will ensure that the callback'sthis object contains the context of the executing function. The resulting compiled script is as follows: //JAVASCRIPT Meteor.call(“use_hammer”, (function(_this) { return function() { returnConsole.log(_this); }; })(this)); CoffeeScript will improve your code and help you write codefaster. Still, itis not flawless. When you start combining functions with nested arrays, things can get complex and difficult to read, especially when functions are constructed with multiple parameters. Let’s look at an ugly query: #COFFEESCRIPT People.update sibling: $in:[“bob”,”bill”] , limit:1 -> console.log “success!” There are a few ways ofexpressing the difference between two different parameters of a function, but by far the easiest to understand. We place a comma one indentation before the next object. Go to coffeescript.org and play around with the language by clicking on the try coffeescript link. Summary We can now program faster because we have tools such as CoffeeScript, Jade, and Stylus to help us. We also seehow to use templates, helpers, and events to make our frontend work with Meteor. Resources for Article: Further resources on this subject: Why Meteor Rocks! [article] Function passing [article] Meteor.js JavaScript Framework: Why Meteor Rocks! [article]

0
0
6788

How-To Tutorials - Design Patterns

Dealing with Legacy Code

Getting Started with Meteor

Trending Topics

Create a Free Account To Continue Reading

SignIn Free Account To Continue Reading