Debugging Rails with Operating System Tools

Unix/Command-Line level tools for making web-developers' lives easier– lsof, nslookup, gdb, tcpdump, ctags, ack, strace, and top

Part 4 in the series A Comprehensive Guide To Debugging Rails {: .series-highlight}

Operating System Level Mirrors

  • Domain name/DNS info: View and debug DNS/IP issues using nslookup. To see the mail records (MX) for a domain type $ nslookup -type=mx oxbridgenotes.co.uk (note that you need to drop any possible "www" in front of the domain here). Look up the IP addresses associated with a domain using $ nslookup www.oxbridgenotes.co.uk. Find out how long a DNS record will be cached for with $ nslookup -debug www.oxbridgenotes.co.uk. See all possible nameservers with $ nslookup -type=ns www.oxbridgenotes.co.uk. Knowing the possible nameservers from which to query IP address and other information enables you to query the result of one particular nameserver using $ nslookup www.oxbridgenotes.co.uk ns1.redhat.com (where ns1.redhat.com was returned as a possible nameserver using type=ns). Why might you want to do this? To quickly verify that changes you made to your DNS records were carried out successfully without waiting a lifetime for the records to propagate across the internet.

  • List of open files and sockets: Figure out what files or sockets your Rails server is at that second interacting with by typing $ lsof -p <PID>, where PID is the process id of the Rails application process. Useful in combination with dtruss when you'd like to map out the file-descriptor arguments to syscalls to the names of the files/sockets in question.

  • curl: Sometimes you cannot easily test a particular URL from within the browser. In these cases curl comes to the rescue. Request the localhost:3000/product_list URL from the command line with $ curl http://localhost:3000/product_list. By default curl performs a GET request, but can also be set to use other HTTP verbs such as POST/PUT with the "X" flag. For example $ curl -X POST http://localhost:3000/to_txt -F file=@/Users/jkinsella/Code/oxnotes-docservice/fixtures/length.pdf posts a form ("F flag") that includes a local file (preceded by "@"). Curl is particularly useful for making requests with custom headers, specified using the "H" flag. For example $ curl localhost:3000/product_list -H "Content-Type:image/gif". Another common use-case is to exclusively return the HTML headers from a request without waiting for the whole body of the request. Use the I option, e.g. curl -I www.google.com

  • Process Listing: Whenever your Rails application starts (server, console, Rake, whatever) it creates an operating system level process. You can search for this process using $ ps aux | grep rails. The output provides information about the process ID, the state/health of that process, and the memory/CPU resources it consumes. For detailed information on interpreting this output refer to man ps. Watch out for situations when you see more Rails processes output than you expect, as this might indicate processes that are hidden/daemons/still running despite not having a terminal tab open and attached the them. Process listing is your best friend in debugging situations where Rails interacts with another program on your operating system. For example many Rails websites use the separate database program postgres, and when things aren't working with your database, a great first step is to check whether the postgres process is running correctly.

  • C Debugger (gdb): (This section assumes you are using the standard Ruby install that is compiled in C, as opposed to MacRuby, JRuby etc.) The C debugger is used to debug problems a Ruby debugger will have difficulty reaching, notably infinite loop bugs. (You'll notice an infinite loop bug because the program seemingly freezes without exiting or spitting out a stack trace.) or in situations where you have a sporadic bug that may disappear on restart and you would rather find out whats wrong now. In getting the gdb to work, the order in which you enter commands is important. First, start your Rails process (server / console / whatever) as normal. Second, execute the request/console command that causes the infinite looping problem. Wait until the Ruby process is stalled then open up a new terminal tab and activate the gdb. Typically you start it with $ sudo gdb $(which ruby) 67563, where the first argument dynamically figures out the Ruby compiler you are using (a necessary argument to gdb), and the second argument is the process id (pid) of the Ruby process running your Rails application. (Figure out what this pid is by running ps aux | grep ruby). Now that gdb is attached to your Rails process you can force a Ruby backtrace with call (void)rb_backtrace(). This backtrace will not appear in your Rails server process terminal tab like it would when you are interacting your Rails software normally through URL requests; instead the backtrace will be appended to the log/development.log file, which you monitor with $ tail -f log/development.log. The backtrace forced through at the C level provides an otherwise unavailable insight into why and where the program stalled. Gdb also provides the possibility of debugging at C level, but that's beyond the scope of this article.

Installation woes: On Mavericks you'll need to install gdb with Homebrew then codesign it by following this guide.

  • tcpdump: The lowest common denominator across unix-like systems for dumping traffic on a network. Say you want to monitor all packet traffic originating from the domain oxbridgenotes.com and interacting with some daemon program or opaque command line script. In this situation you can't simply open up Google Chrome DevTools Network Tab and monitor the request in this way. Instead you must rely on command line tools, and here the command in question is: $ sudo tcpdump -s 0 -A host "www.oxbridgenotes.co.uk". The s 0 option specifies to remove the limits on packet size normally present. The A options says print out ASCII renditions of the packets. Tcpdumps output shows the source IP address and the destination IP address of each packet alongside what was contained within, enabling you to ensure that correct information was sent from the right places at this level. On other occasions, such as when debugging web-sockets, you might want to monitor traffic on a particular port: $ sudo tcpdump port 8000

  • Environmental variables: A number of variables set outside the Rails software, i.e. at the OS level, can influence either our code or the third party libraries we use. You'll know that these variables are in use whenever the codebase contains a reference to the ENV global variable, for example ENV["LOG_LEVEL"] is Ruby's way of accessing the environmental variable LOG_LEVEL. View all the environmental variables on production using $ h config. In development we dump the Heroku environmental variables into the mirrored .env file with $ heroku config > .env, and then these variables become available to our software thanks to the Foreman gem. System level environmental variables may also be set, most typically by user modifications to the ~/.bashrc file. You can view the contents of these environmental variables with echo. For example view the contents of the LOG_LEVEL environmental variable with $ echo $LOG_LEVEL(notice the dollar sign necessarily appended to the variable name). To list all variables and their values type: $ printenv. Access the same listing within your Rails console by typing ENV.

  • Temporary Files: Our application uses the tmp/ folder to store temporary files, such the Word documents we temporarily download before zipping into larger packages and uploading to s3 for storage. Inspect the contents of this folder to debug issues such as whether the algorithm is downloading files correctly. It's useful to employ the watch command to periodically run another sub command (here ls) on the tmp folder, for example $ watch -n 10 ls tmp/seller-invoices checks the contents of the tmp/seller-invoices folder every ten seconds, updating the screen with the contents, enabling you to see changes as they happen.

  • which: A godsend for figuring out whether there are multiple installations of command line programs, be that our own or those installed by gems or homebrew, and for ascertaining whether invoking that program via command line uses the version we intended. $ which -a postgres reveals all the locations of postgresql binaries currently on the $PATH, with the binary called whenever you type the unqualified command (e.g. $ postgres) appearing first in this list. Change the order of the commands output by which by modifying the $PATH variable in your ~/.bashrc file.

  • Exuberant Ctags: Used in conjunction with Vim through the EasyTags plugin, Exuberant Ctags lets you quickly jump to the definition of any function within your codebase. Force a project level tag refresh with :UpdateTags -R ~/Code/oxbridgenotes/. Enter Vim and jump to the function generate_products by typing :ta generate_products. The :ta command also accepts regular expressions, meaning you can find all functions starting with "generate" using :ta /generate/*. View the entire list of matches with :ts(elect) (bracketed letters are optional in invoking the Vim function) and move forwards one tag with :tn(ext) and backward with :tp(rev). By default Ctags show all the tags within your Rails project directory, leaving out that of the gems. It's more helpful to extend tagging to include the codebase of all your gems by using this command $ ctags -e --exclude=.git --exclude='*.log' -R *bundle show --paths. Now you can type in the name of any function within any of the gems listed in your Gemfile` and jump to the source code- an absolutely enormous time-saver.

  • history grepping: Search previous command line commands. For example I can search all curl commands with $ history | grep curl, and use this to refresh my memory about a particular set of flags I used to solve a problem in the past. I set my history to hold 20000 lines with export HISTSIZE=20000 added to my ~/.bashrc, meaning that I effectively keep my history of commands as a reference for years.

  • ack for real world usage examples: Sometimes legacy functions are nicely documentedbe that in your project documentation, the unit tests validating that function, or the in-code comments above the function. More often though, these functions arent documented at all and it may not be apparent how they should be used. The very fact that a legacy function has already been used elsewhere in your project leads to implicit usage documentation. For example you find the generate_product(subject, price) function and what data type price is supposed to be a string, Integer, Float, or something else? Type $ ack generate_product and amongst the results you may find clues of prior usage, such as generate_product(subject, 12.33), indicating that you should use a float. A variation of this technique is to refer to the values historically saved within a database column by printing out all the unique values ever saved and then inferring what format kind of format the column expects. For example Product.select(pricing_category).uniq returns all the pricing_categories saved within the database, informing you that it's a one letter upper-case string code.

  • dtruss (OS X) / strace (linux): Monitor a) which operating system level calls (syscalls) your program makes, with what arguments, and how long each call takes to execute; and b) the OS signals a process receives. Dtruss spits out a fire a great deal of data, but is nevertheless useful in situations where you can't afford to stop the running program (e.g. in production) or when you have a hunch that the problematic program is bug-free but youre just using it wrong (e.g. your Unicorn server; init processes; anything involving configuration files that should appear in a specific location). Dtruss works programs ultimately rely on the operating system to do work such as opening files, forking new processes or printing output to screen, and these operations provide traceable clues as to the functioningor malfunctioningof your program. When using OS X you only have dtruss available to you. Begin with $ sudo dtruss -p PID_OF_YOUR_RUBY_PROCESS -f. The f option specifies that youd also like to follow child processes, usually a sensible decision in understanding the scope of your program. The a option asks dtruss to print all information, including elapsed time and the process in question, a welcome counterpart to the f option. As your Ruby process runs you'll see a stream of operating system calls. Youre probably confused about what each of these functions meansI quote author of the Passionate Programmer, Chad Fowler on his experience with learning strace Neither did I at some point. I learned most of this stuff by tracing broken programs and reading man pages. For a full tutorial refer to Chads excellent piece.

  • man search: You probably know that you can type $ man ANY_COMMAND_LINE_OR_SYSTEM_COMMAND to display its documentation in the terminal. A complementary time-saver that I only learned about much later was that you quickly search for keywords within man by typing forward slash followed by your keyword, e.g. /directory. Move forwards and backwards through the highlighted matching results with N or n.

  • top: Displays constantly updated, sorted information about resource usage of each process. For example $ top -o cpu, shows the currently running processes in order of their CPU usage. (the o option stands for order). Top is useful to identify whether a stalled processes is statically stalled (e.g. waiting for a resource to be released/sleeping) since this would have close to 0% CPU usage, or is dynamically stall (e.g. infinite loop) since this often has heavy CPU usage.


More Articles: Click here for full archive

Debugging Rails with Memcached

Many apps use Memcached in their infrastructure so it's important to know how to debug thorny caching errors.


Email Marketing

The relevant laws, some growth-hackery tips, and suggestions about good emails to send


Dealing with Test Leakage

Strategies for pinpointing and extricating indeterminacy from software tests.