Troubleshooting Network Connectivity (3)

June 7th, 2008

The TCP/IP network that most people use today is not really plug-and-play; it needs proper configuration.  I typically ask the following questions to figure out whether the network configuration is correct.

  1. Do I have an IP address?
  2. Do I have a working name service?
  3. Can I reach the machine that I want to connect to?

The most basic and important part of network configuration is getting a valid IP address.  The ipconfig tool will tell you whether you have one (i.e. something other than 0.0.0.0 or 169.254.*.*).  If you don’t, the possible reasons can be narrowed down further:

  1. No actual network link is established (typically happens on wireless networks)
  2. DHCP is down
  3. Misconfiguration
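
The "valid IP address" check described above can be sketched in a few lines of Python.  This is only an illustration under my own assumptions: the function name looks_usable is made up, and it only classifies an address string that you have already read off the ipconfig output.

```python
import ipaddress

def looks_usable(addr: str) -> bool:
    """Return True if addr is a well-formed IPv4 address that is neither
    0.0.0.0 (unspecified) nor in 169.254.0.0/16 (link-local, self-assigned)."""
    try:
        ip = ipaddress.IPv4Address(addr)
    except ipaddress.AddressValueError:
        # Not a well-formed IPv4 address at all.
        return False
    return not (ip.is_unspecified or ip.is_link_local)

if __name__ == "__main__":
    # Sample addresses are hypothetical.
    for candidate in ("192.168.1.23", "0.0.0.0", "169.254.17.4"):
        print(candidate, looks_usable(candidate))
```

A 169.254.*.* address usually means the machine gave up waiting for DHCP and assigned itself an address, which is why it counts as "no valid IP" here.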

Wireless devices today typically negotiate well with the wireless router.  However, if you want the wireless link to be secure, you’ll need to enable WPA or WPA2, which requires setting up passwords to connect to the router.  I find that the problems often come from the tools shipped with wireless routers/adapters to “help” users set up these configurations.  From my personal experience/bias, these tools create more problems than they solve.  Here are some rules of thumb that I’d suggest:

  • Do not install additional programs unless absolutely necessary
  • If there is a Windows built-in device driver for the wireless adapter, and there is no strong evidence that the vendor-provided driver performs better, use the Windows one.  It’s not a 3D graphics card, and there’s typically little performance difference between Microsoft and vendor drivers.  More often than not, you don’t really need the add-on features provided by vendor drivers.
  • If you can install the device driver through Device Manager in Windows, do it that way.  I’ve seen too many ill-written setup programs for wireless adapters that do a lousy job and screw up the registry.
  • IMO the best way to do WPA configuration is through the router’s web interface and the Windows Control Panel.  Remember to write down the passwords and settings.

The test for an actual network link to the router is whether you can ping it (say, run ping 192.168.1.1 and see that it responds).  Remember to turn off your firewall program while you ping, because some firewalls block ping (ICMP) traffic.
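
If you’d rather script the ping test, here is a rough Python sketch.  The function names are my own, 192.168.1.1 is just the example address used above, and the only flags it relies on are the count options of the stock ping command (-n on Windows, -c on most other systems).

```python
import platform
import subprocess
from typing import List, Optional

def ping_command(host: str, count: int = 4, system: Optional[str] = None) -> List[str]:
    """Build the ping command line for the given (or current) operating system."""
    system = system or platform.system()
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, str(count), host]

def router_responds(host: str = "192.168.1.1") -> bool:
    """Run ping once and report whether it exited successfully.

    The exit code is only a rough signal; read the ping output itself
    when in doubt.
    """
    result = subprocess.run(ping_command(host), capture_output=True)
    return result.returncode == 0
```

Calling router_responds() with no arguments pings the example router address; pass your own router’s address if it differs.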

DHCP is a nice tool to have, and most routers today have a built-in DHCP server.  However, if you don’t have an IP address and the network link is okay (i.e. you can ping the router), I’d suspect that the DHCP server is down.  A router can still be operational even though its DHCP server is not working, so power cycling the router is usually worth a try.  (Just think of the router as a small computer running programs; the DHCP server is one of these programs, so it can be dead while the other programs run normally.)  Again, you’ll need to disable the firewall, run “ipconfig /renew”, and see if you can acquire an IP address from the DHCP server.

One problem that I often see is that people statically configured their network in the beginning (with some guru’s help), then purchased new equipment and ran its auto-configuration tool.  There’s little chance that the auto-configuration tool will respect the existing network configuration and set things up accordingly.  As a result, you might have some computers configured as 192.168.1.* and others as 192.168.2.*, which will by no means work.
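
To see why the 192.168.1.* and 192.168.2.* machines can’t talk to each other directly, here is a small Python sketch.  It assumes the usual home-network netmask of 255.255.255.0 (a /24 prefix), which the text above does not state, and the function name same_subnet is made up for illustration.

```python
import ipaddress

def same_subnet(a: str, b: str, prefix_len: int = 24) -> bool:
    """Return True if both addresses fall in the same network for the
    given prefix length (24 corresponds to netmask 255.255.255.0)."""
    net_a = ipaddress.ip_interface(f"{a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{b}/{prefix_len}").network
    return net_a == net_b

if __name__ == "__main__":
    print(same_subnet("192.168.1.10", "192.168.1.200"))  # same /24 network
    print(same_subnet("192.168.1.10", "192.168.2.10"))   # different /24 networks
```

Machines on different subnets need a router between them, and a statically configured home network usually has no route set up for that.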

The other typical misconfiguration is that people have two routers and try to set them up in a single network segment (e.g. an old wired router as 192.168.1.1 and a new wireless router as 192.168.1.2).  You cannot have two routers in a single segment.  If the address numbering really matters, one way is to play netmask tricks and set up static routes so that every device has a 192.168.1.* address.  The other way is to demote the wireless router to an access point (AP) or the wired router to a hub, so that only one router is in charge.

 

Technical, Troubleshooting, Windows

Troubleshooting Network Connectivity (2)

May 31st, 2008

We can roughly classify hardware-related issues into the following categories:

  • heat dissipation issues
  • cabling and contact issues
  • equipment hardware failure

People often toss their routers, hubs, and switches into a dark corner beneath the desk, along with messy cables, so that they are out of sight.  If the room is warm, these poor little things have a good chance of failing because of heat dissipation problems: they generate heat and they need proper ventilation!

Cabling issues are quite common and are getting worse thanks to the trend of cutting manufacturing costs.  Personally I put a question mark on every Ethernet cable unless it has passed inspection with a Fluke tester.  Well, a Fluke is an expensive toy for geeks and not for the average Joe.  The other way is to find replacement cables and do some cable swapping.  Basic probability tells us that, if the defect rate of cables is constant, the chance that every cable you try is defective shrinks quickly as you try more cables.
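
The cable-swapping arithmetic can be written down as a tiny Python sketch.  It assumes that cable defects are independent of each other, and the 10% defect rate below is a made-up number purely for illustration.

```python
def chance_all_defective(defect_rate: float, cables_tried: int) -> float:
    """Probability that every one of the cables tried is defective,
    assuming each cable is defective independently with the same rate."""
    return defect_rate ** cables_tried

if __name__ == "__main__":
    # Hypothetical defect rate of 10%.
    for n in (1, 2, 3):
        print(n, chance_all_defective(0.10, n))
```

So if the link still fails after swapping in two or three different cables, the cable is probably not the culprit.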

The most frustrating issues are contact issues.  Many times we think that we have things plugged in firmly but they actually are not.  It takes a few extra minutes to double check all the plugs, but that might save you hours later.

Equipment hardware failures are not rare.  Again, thanks to the magic of cost-cutting, you simply can’t expect too much from the dirt-cheap routers/switches/hubs/cables of today.  However, we should rule out all configuration and software issues before starting the always tedious RMA process (if any).

Technical, Troubleshooting, Windows

Troubleshooting Network Connectivity (1)

May 26th, 2008

Part of my job is to help people troubleshoot network connectivity issues.  As you all know, networking today is not really a plug-and-play thing, at least not as easy as setting up a toaster.  So, when we have a network issue, what or where should we look at?  Here is a summary of the steps I typically use, and I’ll elaborate on these items in this series of articles.

  • Rule out stupid issues
  • Rule out hardware issues
  • Rule out configuration issues
  • Rule out software issues

Rule out stupid issues
Here are some of my favorite stupid issues. Make sure they are checked before any investigation :)

  1. Do we have everything powered on?
  2. Do we have everything plugged in firmly and correctly?
    If there are young children around the house, double check these two items and make sure they do not have access to the working area!
  3. Did I pay my internet/phone bill?
  4. Are we free of Faraday cages?
    In my experience, this is surprisingly common.  People often disassemble their computers and put the case panels near their wireless router, forming a perfect Faraday cage that keeps them from using wireless.
  5. Are batteries charged and installed correctly, if any?

If all answers to these quick questions are yes, we can now move on to the next steps.

Technical, Troubleshooting, Windows

Things Learned from Performance Tuning

February 10th, 2008

One major blocker that keeps a software project from finishing is unsatisfactory performance.  Engineers are always asked to tune the performance to the customers’ satisfaction no matter what.  Performance tuning itself is a very interesting topic and I’d like to share some interesting things that I’ve encountered.

The two main targets of performance tuning are reducing execution time and shortening waiting time.  Reducing execution time is typically related to back-end applications.  I was once involved in a banking software project that had a 19.5-hour batch.  Well, you know this is way too absurd to be acceptable.  After a month of continuous tuning, the batch took 19 minutes to complete.  Such drastic results might not be common, but I did learn some very interesting things:

  • The table design of a database is very important.  Most of the (banking) projects that I’ve been involved in designed their database based on existing table structures from the customers.  Unfortunately, these table structures are not likely to be well optimized (at least for the project you are working on).  At the very beginning of the project, we need to identify the most frequent or critical query paths and design the table structures accordingly.
  • The golden rule of stored procedure selection is to make sure the first selected data set is the smaller one.  When a cursor points to a smaller data set, the loop runs fewer times and performance is typically better.  The example I gave above improved from 19.5 hours to 6 just by switching the order of two nested select statements.
  • As a team leader, you need a realistic assessment of your team members’ capabilities.  Some people might consider this micro-management; however, I found it a super common scenario.  One good way to get an idea is to talk to your team members and see how they plan to deal with the issues.  If there’s no plan, or the execution/executability of the plan is questionable, you’ll need a plan B.
  • Customers and PMs are usually trapped in the myth of scalability: they demand things that have far more capacity than needed, or that are so flexible that they can perform all kinds of ridiculous magic.  It’s important to bring them down to earth and get the job done.
  • Most frustrations come from the issue of “real requirements”.  In many cases, the performance issue is just a product of miserable management, poor communication, political complexity, or some combination of this crap.  For example, if two VPs of your customer’s company ask for technically conflicting features, what will you do?  One way is to wait another five years for Moore’s Law to defeat them with raw computation power.  Sometimes we just need a little help from the higher-ups to get the deal settled.
  • Many people have learned that improving the algorithm pays off more than optimizing the code.  However, this idea is based on the assumption that the original algorithm is implemented with quality code.  Please make sure the code is quality code before wasting time figuring out better, faster, fancier algorithms.  Lame code can screw up the best algorithm on the planet without any problem.
  • If you work with a gigantic team, it is almost fated that your team’s product will be slow due to the nature of bureaucracy.  It’s purely a management problem and needs courage and luck to overcome.
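
The "smaller data set first" rule from the stored-procedure bullet can be illustrated with a toy Python sketch.  It is not the original banking code: the row layout and the function name are invented, and it assumes the inner lookup is indexed (modelled here with a dictionary), so the cost is dominated by how many rows the outer loop walks through.

```python
from collections import defaultdict

def join_driving_from(outer_rows, inner_rows, key):
    """Nested-loop style join: walk outer_rows one by one (like a cursor)
    and look up matching inner_rows through an index on `key`.
    Returns the matched pairs and the number of outer-loop iterations."""
    index = defaultdict(list)          # stands in for a database index
    for row in inner_rows:
        index[row[key]].append(row)

    iterations = 0
    matches = []
    for outer in outer_rows:
        iterations += 1
        for inner in index.get(outer[key], []):
            matches.append((outer, inner))
    return matches, iterations

if __name__ == "__main__":
    small = [{"acct": i} for i in range(3)]                   # hypothetical small set
    large = [{"acct": i % 100, "n": i} for i in range(1000)]  # hypothetical large set
    for name, (outer, inner) in {"small first": (small, large),
                                 "large first": (large, small)}.items():
        matches, loops = join_driving_from(outer, inner, "acct")
        print(name, "matches:", len(matches), "outer loops:", loops)
```

Both orders find the same matches, but driving from the small set walks far fewer rows, which is the effect described in the bullet.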

Shortening waiting time is typically not a technical issue, unless the design is bad from the very beginning or the customer’s network infrastructure is not capable.  Most solutions involve psychological effects, i.e. perceptual performance.  For example, I once had a customer complain that our report-generating program was too slow because it took 5 minutes to generate a 50-page report.  Okay.  Later I gave them another version, with a dialog showing a progress bar and flashing text about the progress of report generation, and the customer was very happy with it.  Guess what?  The new program took 7 minutes to generate the same 50-page report :)  Well, the trick is very simple: users feel that things are “moving” if they see a responsive UI instead of a stalled one.
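
The progress-bar trick boils down to reporting progress from inside the slow loop.  Here is a minimal Python sketch of the idea; it is not the original report program, and the function names and the text-mode progress bar are invented for illustration.

```python
def generate_report(page_count, on_progress):
    """Produce the pages one by one and notify the caller after each page."""
    pages = []
    for page_no in range(1, page_count + 1):
        pages.append(f"page {page_no}")   # stand-in for the slow rendering work
        on_progress(page_no, page_count)  # lets the UI show that things are moving
    return pages

def print_progress(done, total, width=20):
    """Draw a simple text progress bar on one console line."""
    filled = width * done // total
    bar = "#" * filled + "." * (width - filled)
    print(f"\r[{bar}] {done}/{total} pages", end="", flush=True)

if __name__ == "__main__":
    generate_report(50, print_progress)
    print()
```

The callback adds a little work per page, so the total time can only go up, yet the program feels faster because the user sees it moving.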

By the way, in my previous article about the myth of memory usage, we saw another form of perceptual performance, although it is not related to shortening waiting time.  Tuning perceptual performance can be tricky and needs a lot of work in usability research.  If you do not have a usability guy covering your back, it’s time to have more chats with the customers and try to figure out what they think.

Uncategorized

The Relationship between Memory Size and Performance

January 29th, 2008

One can easily spot an application’s memory usage in Windows using Task Manager.  However, the numbers shown are often misinterpreted, which is a source of frustration for me when I am bugged by some “professionals” with perfect misunderstandings.

Memory usage has a formal name: “working set size”.  A working set is the physical memory allocated by the Windows Virtual Memory Manager (VMM) for a process.  As a result, if someone reads the book Windows Internals and stops here, he will start complaining that a program is “fatty and slow” when he sees a big working set.  Let’s read a bit further before convicting anyone.  A working set contains two parts: the private working set and the shared working set.  The private working set is used only by the owning process, while the shared working set is shared among processes.  As a result, we should use the private working set size as the standard of judgement, which tools like Process Explorer can show.

However, the size of the private working set can be deceiving.  The Windows SDK provides an API named SetProcessWorkingSetSize, which allows developers to force the Windows VMM to swap out all pageable pages and thereby lower the reported size of the private working set.  The pitfall about the relationship between memory size and performance is that people believe a smaller memory footprint implies faster execution and better performance.  This is an over-simplified corollary.  The actual performance killer is page faults, not memory usage.  Programs with higher memory usage have a higher probability of page faults.  A program runs best if all the memory it demands is loaded, even if the size is huge.  Many programs try to satisfy “perceptual performance” based on this common pitfall and use SetProcessWorkingSetSize to beautify the numbers.  From a real performance point of view, this is actually not optimal, because the use of SetProcessWorkingSetSize invites page faults.
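
For the curious, this is roughly what the number-beautifying trick looks like when called from Python through ctypes.  Treat it as a sketch of the trick being criticized, not a recommendation: the function name trim_own_working_set is my own, and it relies on the documented behavior that passing (SIZE_T)-1 for both sizes asks Windows to remove as many pages as possible from the working set.

```python
import ctypes
import sys

def trim_own_working_set() -> bool:
    """Ask Windows to trim this process's working set.

    Returns True if the API call succeeded, and False on failure or when
    not running on Windows.
    """
    if sys.platform != "win32":
        return False

    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetCurrentProcess.restype = wintypes.HANDLE
    kernel32.SetProcessWorkingSetSize.argtypes = [
        wintypes.HANDLE, ctypes.c_size_t, ctypes.c_size_t
    ]
    kernel32.SetProcessWorkingSetSize.restype = wintypes.BOOL

    # (SIZE_T)-1 for both the minimum and the maximum size means
    # "remove as many pages as possible from the working set".
    minus_one = ctypes.c_size_t(-1)
    ok = kernel32.SetProcessWorkingSetSize(
        kernel32.GetCurrentProcess(), minus_one, minus_one
    )
    return bool(ok)

if __name__ == "__main__":
    print("trimmed:", trim_own_working_set())
```

The number in Task Manager drops right after the call, but the pages have to be faulted back in as soon as the program touches them again.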

Working set size is a good reference, but it is not directly related to the real performance of an application.  The real killer is page faults, especially when they are caused by memory leaks.  Developers care about working set size because it’s the perfect indicator of memory leaks.  Moreover, page faults are just one factor that affects performance, and there are many others to explore and tune.

Windows itself actually calls SetProcessWorkingSetSize for programs, provided that the target program has a window and the window is minimized.  I guess this is based on a pessimistic philosophy that most programs do not free unused pages right away, so Windows does it for them to boost performance.  This interesting behavior actually introduces a very counter-intuitive phenomenon.  When people find that the computer is slow, their first reaction is typically to minimize all the windows on the desktop except the one they are working on.  Doing so forces the SetProcessWorkingSetSize calls and incurs more swapping, which slows down the system further.

Technical, Windows

UI Development (3)

January 20th, 2008

For enterprise applications, web-based architecture has great potential and advantages, especially in deployment and patching.  With web-based applications, IT people no longer need to install software computer by computer (theoretically), and having fewer applications installed on a PC implies a more stable system.  Web-based applications may also lower the hardware requirements for client-side PCs because most tasks are performed on the server side.

However, web-based applications also have several constraints: requirements for network infrastructure and bandwidth, UI capability limitations, report generation and printing, and security concerns (on browsers and on servers).  One important reason that technologies like AJAX have become so popular is that these dynamic page rendering techniques effectively save bandwidth, which means lower cost and better utilization of the network.  For report generation and printing, all the successful cases that I know of are based on PDF or Word file generation.  As to security concerns, most SI vendors are supposed to know what they need to be aware of; however, it usually depends on how many resources will actually be invested, which is outside this article’s scope.

Web application UIs have several challenges, such as JavaScript, browser compatibility/limitations, and the lack of UI elements.  JavaScript is a very loose language and most browsers run these scripts with interpreters of not-so-great performance.  For example, if you’ve used Siebel’s portal, you share the pain of a snail-paced UI.  AJAX can somewhat disguise the slowness, provided that the UI is not too complex (and that’s why Google’s interfaces are always so “clean”).  Browser compatibility and limitation issues, such as supporting all the different flavors of IE and Firefox, are still pains in web developers’ rears.

The HTML specifications offer very few UI elements.  If the control that you want to use is not provided in HTML forms, you are pretty much on your own to struggle your way out.  The types of elements are even fewer than what VB 1.0 provided!  What’s even better is that event handling of UI elements is a known headache generator.  Fortunately, there are so many developers working out their own ways that most of the problems can be answered by either a Google search or mighty Uncle Ben.  (Well, if a question can be answered by Uncle Ben, the real problem is losing profit, not the question itself.)

Web-based application development is mature today and there are many handy IDEs and tools to use.  It is still actively evolving and we can see new tools and buzzwords popping up at great velocity.  One thing we can be sure of is that the diversity of tools and platforms for web-based applications will remain, because everyone is supposed to have learned from the history of Microsoft dominance.  Well, this is just a theory.  We have seen people repeat other people’s paths to failure before.  I just assume people today are smarter.

Reviews & Comments, Technical

UI Development (2)

January 17th, 2008

Thanks to a memory leak of my own, I can’t tell the exact time, but around the VB 5.0 timeframe there was a “new trend” in UI development: HTML.  Desktop applications embedded an HTML engine to render their UI.  The reasons that I can remember were:

  • It feels cool!
  • It’s much easier to deal with i18n problems with HTML.
  • The look-and-feel of the UI is no longer constrained by Windows controls, and sometimes playing GIF tricks in HTML is easier than doing fully customized GDI drawing.
  • One can separate presentation from business logic in a cleaner, easier-to-understand way.

The most popular way to do an embedded HTML UI (to this day) is to embed Internet Explorer via the IWebBrowser2 COM interface.  There are some additional benefits of doing so:

  • One can get through firewalls and exchange data with backend servers by utilizing the existing HTTP/HTTPS functionality built into IE.
  • By creating a desktop application, one can do things that cannot be (easily) done by pure browser-based thin clients, such as drag-and-drop or fine-grained printer control.

However, as we all know, there are many drawbacks to embedding IE in the application, such as carrying over the vulnerabilities of IE, memory/resource consumption, crappy error handling in JavaScript, etc.  Many ISVs still choose to use this technique regardless, because its benefits are too great to give up.

By the way, IE (or should I say, Microsoft) is not interested in supporting non-XML-based CSS.  As a result, one is forced to use alternative ways to achieve things that could be done easily with CSS, which increases the complexity and maintenance cost of the program.  Did I mention that JavaScript is super slow and super hard to debug?

There are other choices of HTML rendering engines, such as Firefox Gecko, or HTMLayout.  Both engines provide better support for CSS.  Maybe we will see more applications that are powered by Gecko in the future.

Reviews & Comments, Technical, Windows

Arthur’s “always-under-construction” blog
