Network connections often need debugging, but by the nature of the networking beast, the actual topology of the network may be unknown. We usually think of a connection as going from A to B; for example, your desktop computer connects to Amazon’s website. To expose the actual topology of a network, we can use a simple tool called Trace Route (traceroute), which shows the route our data takes to its destination.
As you may have gathered, network routing is a lot more complex than simply A to B. If it really were that simple, any problem could occur in only one of two places. In reality, the number of network entities between you and your destination (Amazon, Google, SMTP servers, etc.) can be a lot higher. As I type, the number of network entities between me and Amazon.com is over 30!
How does Traceroute work?
When your program attempts a connection, it is usually unaware of the distance or the number of network entities between you and the destination. That is handled much lower in the OSI model, sparing your program the headache of routing tables, sessions, data segments and the like. So if a program doesn’t know how its data is routed, how does traceroute work?
Traceroute works by relying on a networking parameter called Time To Live (TTL). Every time your data is passed from one entity to another along a network, its TTL value is decreased by 1. When the TTL reaches zero, the data is discarded, the node that discarded it normally sends back an ICMP ‘Time Exceeded’ message, and your connection fails. FWIW, TTL has a maximum value of 255, and a common default is 64.
Traceroute starts by trying to reach Amazon, for example, with a TTL of 1. This means the probe can only ‘hop’ once down the network. The first node discards it and reports back, which reveals the IP address of the node the probe got to before it failed; let’s say it’s Node #1.
Traceroute then tries the connection again, this time with a TTL of 2. It’ll get to Node #1, but this time manage one extra hop. Again the IP address will be returned, but this time it’ll be for Node #2. Traceroute will continue probing each node by increasing the TTL, until the destination is reached. Each subsequent probe logs the IP address and time it takes to get to each node along the network.
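The probing loop described above can be sketched as a small simulation. This is only a toy model, not how traceroute is actually implemented: the node names in `PATH`, and the helpers `send_probe` and `trace`, are made up for illustration, and no real packets are sent.

```python
def send_probe(path, ttl):
    """Walk one probe along `path` and return (node, reached_destination)."""
    for index, node in enumerate(path):
        if index == len(path) - 1:
            return node, True   # the probe arrived at the destination
        ttl -= 1                # an intermediate node decrements the TTL
        if ttl == 0:
            return node, False  # TTL expired: this node drops the probe and reports back
    raise ValueError("path must not be empty")


def trace(path, max_hops=30):
    """Send probes with TTL 1, 2, 3, ... until the destination answers."""
    hops = []
    for ttl in range(1, max_hops + 1):
        node, reached = send_probe(path, ttl)
        hops.append((ttl, node))
        if reached:
            break
    return hops


# Hypothetical route: three intermediate nodes, then the destination.
PATH = ["node-1", "node-2", "node-3", "destination"]

for ttl, node in trace(PATH):
    print(ttl, node)
```

Each pass through the loop in `trace` corresponds to one line of traceroute output: the TTL used for the probe and the node that answered it.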
3 Useful Traceroute Examples
1. Running a simple traceroute
Using just the server name of your destination, you’ll get an output like this.
$ traceroute google.com
traceroute to google.com (126.96.36.199), 30 hops max, 60 byte packets
1 10.0.2.2 (10.0.2.2) 0.138 ms 0.076 ms 0.135 ms
2 172.16.5.254 (172.16.5.254) 10.571 ms 10.612 ms 10.574 ms
3 172.16.54.1 (172.16.54.1) 0.786 ms 0.947 ms 0.945 ms
4 188.8.131.52 (184.108.40.206) 0.939 ms 0.906 ms 0.990 ms
5 220.127.116.11 (18.104.22.168) 10.488 ms 10.482 ms 10.628 ms
6 22.214.171.124 (126.96.36.199) 10.458 ms 188.8.131.52 (184.108.40.206) 19.394 ms 220.127.116.11 (18.104.22.168) 19.457 ms
7 22.214.171.124 (126.96.36.199) 19.414 ms 188.8.131.52 (184.108.40.206) 21.835 ms 21.843 ms
8 220.127.116.11 (18.104.22.168) 21.588 ms 22.214.171.124 (126.96.36.199) 21.681 ms 188.8.131.52 (184.108.40.206) 20.977 ms
9 220.127.116.11 (18.104.22.168) 20.776 ms 20.799 ms 22.214.171.124 (126.96.36.199) 20.578 ms
10 188.8.131.52 (184.108.40.206) 20.851 ms 20.773 ms 220.127.116.11 (18.104.22.168) 21.142 ms
11 22.214.171.124 (126.96.36.199) 20.994 ms 188.8.131.52 (184.108.40.206) 20.506 ms 220.127.116.11 (18.104.22.168) 19.818 ms
12 22.214.171.124 (126.96.36.199) 19.770 ms 19.753 ms 14748.134 ms
You can easily see down the left-hand side of the output how many network nodes were used; in this example it took 12 hops to reach Google. The IP address of each node is shown beside its hop number.
By default, traceroute makes each attempt 3 times, with the round-trip time for each attempt shown after the IP address. Nodes 1-5 show the normal output: one IP address followed by three times. You’ll notice though in my example that nodes 6-11 have additional IPs listed; this tells us that subsequent attempts were handled by nodes on different IPs (load balancers, etc.).
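To make the column layout concrete, here is a rough sketch of how one hop line could be split into responders and their times. The function `parse_hop` and its regular expression are my own illustration, not part of traceroute, and they only cover the plain `host (ip) time ms` layout shown in this article; times are grouped under the address printed in parentheses.

```python
import re

# One token is either "host (ip)", a "12.345 ms" timing, or a "*" (no reply).
TOKEN = re.compile(r"(?P<host>\S+) \((?P<ip>[^)]+)\)|(?P<ms>\d+\.\d+) ms|(?P<star>\*)")


def parse_hop(line):
    """Return (hop_number, {ip: [times in ms]}) for one line of output."""
    number, _, rest = line.strip().partition(" ")
    responders = {}
    current = None
    for match in TOKEN.finditer(rest):
        if match.group("ip"):
            # A new address: the timings that follow belong to it.
            current = match.group("ip")
            responders.setdefault(current, [])
        elif match.group("ms") and current is not None:
            responders[current].append(float(match.group("ms")))
    return int(number), responders


# Two lines taken from the output in this article.
hop1 = "1 10.0.2.2 (10.0.2.2) 0.138 ms 0.076 ms 0.135 ms"
hop7 = ("7 22.214.171.124 (126.96.36.199) 19.414 ms "
        "188.8.131.52 (184.108.40.206) 21.835 ms 21.843 ms")

for line in (hop1, hop7):
    number, responders = parse_hop(line)
    print(number, len(responders), "responder(s)", responders)
```

A hop with more than one key in the returned dictionary is one where the three attempts were answered from different addresses.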
2. What does an asterisk/star in traceroute mean?
As explained above, traceroute works by sending out probes with a certain TTL. Sometimes, however, instead of seeing an IP address you may see one or more asterisks (stars) on your screen, e.g.:
7 188.8.131.52 (184.108.40.206) 20.622 ms 220.127.116.11 (18.104.22.168) 20.544 ms 22.214.171.124 (126.96.36.199) 20.368 ms
8 188.8.131.52 (184.108.40.206) 20.134 ms 220.127.116.11 (18.104.22.168) 20.924 ms 22.214.171.124 (126.96.36.199) 21.047 ms
9 188.8.131.52 (184.108.40.206) 20.898 ms 20.715 ms 20.729 ms
10 * * *
11 * * *
12 220.127.116.11 (18.104.22.168) 21.294 ms 22.214.171.124 (126.96.36.199) 21.205 ms 188.8.131.52 (184.108.40.206) 21.482 ms
13 220.127.116.11 (18.104.22.168) 20.850 ms 22.214.171.124 (126.96.36.199) 21.590 ms 21.010 ms
14 188.8.131.52 (184.108.40.206) 20.703 ms 20.620 ms 220.127.116.11 (18.104.22.168) 20.450 ms
This means that nodes 10 and 11 didn’t respond correctly, didn’t respond at all, or didn’t respond within the required time. If it’s just one or two asterisks, it’s likely to be a timeout. Three asterisks, as seen here, are a fairly good sign that nodes 10 and 11 don’t respond to the standard traceroute probe you’re sending out. You can change the mode of traceroute, but that’s beyond the scope of this article. Have a look at the Linux man page for traceroute if you’d like to find out more.
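The rule of thumb above can be written down as a small helper. This is just a sketch of the reading suggested in this article, not anything traceroute itself reports: `classify_hop` is a made-up name, and the third sample line is invented to show a partial timeout, since the output above doesn’t contain one.

```python
def classify_hop(line):
    """Label one hop line by how many of its probes went unanswered."""
    fields = line.split()[1:]        # drop the hop number
    stars = fields.count("*")        # probes with no reply
    answered = fields.count("ms")    # every reply ends in a "ms" timing
    if stars == 0:
        return "all probes answered"
    if answered == 0:
        return "no reply (node probably ignores this kind of probe)"
    return "partial reply (likely a timeout)"


# The first two lines come from the output in this article;
# the last one is a hypothetical example of a single lost probe.
samples = [
    "9 188.8.131.52 (184.108.40.206) 20.898 ms 20.715 ms 20.729 ms",
    "10 * * *",
    "5 10.0.0.1 (10.0.0.1) 1.204 ms * 1.310 ms",
]

for sample in samples:
    print(sample.split()[0], classify_hop(sample))
```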
3. Change the default TTL in traceroute
In our explanation of how traceroute works above, we said the Time To Live starts at 1. This means the first few nodes will always be your immediate internal network, before the probes are routed externally. These nodes, although equally important to the transportation of your data, may be less useful to see reported every time.
By using the ‘-f’ argument, you can tell traceroute to start its TTL at a higher value.
This becomes really useful when you have particularly slow node(s) along the route to your destination. If node 10 was performing really slowly, you could use the following example to start the probes just after node 10.
$ traceroute google.com -f 11
traceroute to google.com (22.214.171.124), 30 hops max, 60 byte packets
11 126.96.36.199 (188.8.131.52) 9.611 ms 9.478 ms 184.108.40.206 (220.127.116.11) 9.554 ms
12 18.104.22.168 (22.214.171.124) 10.083 ms 9.282 ms 9.357 ms
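If you find yourself running this from a script, the command line can be assembled programmatically. The sketch below only builds the argument list and prints it; it doesn’t run traceroute. `traceroute_command` is a hypothetical helper, and the only option it relies on is the ‘-f’ argument described above.

```python
import shlex


def traceroute_command(host, first_ttl=1):
    """Build the argument list for traceroute, optionally skipping early hops."""
    if first_ttl < 1:
        raise ValueError("first_ttl must be at least 1")
    command = ["traceroute"]
    if first_ttl > 1:
        # -f sets the TTL of the first probe, so earlier hops are not reported.
        command += ["-f", str(first_ttl)]
    command.append(host)
    return command


print(shlex.join(traceroute_command("google.com", first_ttl=11)))
```

The resulting list could be handed to something like `subprocess.run` on a machine where traceroute is installed; I’ve left that out so the example runs anywhere.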