Web Services

Web Services was another topic that I couldn’t find a good overview / introductory article for. Considering that they’re such a huge component of the web, I figured I’d write up my Web Services presentation so that future generations can benefit from my knowledge :P

So, Web Services. Back in the day, there was the static web. This was the static document stuff I was talking about in my last article, where the internet was basically made up of documents that linked to one another. Which was great. But when it came to programmatically accessing those documents, it was a little rough – if you ask a server for information, it sends you a full HTML document which you then have to parse information out of. And as soon as that HTML document structure changes (which can happen pretty often with site updates and design changes), your parser code probably won’t work anymore.

Enter Web Services. Web Services provide a way for people to send and/or receive data from a web server using a standard protocol (HTTP). This is huge. Before this there were network services, but they were mostly vendor or application-specific ways of transferring data. If you didn’t know exactly how the server software you were talking to worked, it would be pretty near impossible to get anything useful from it.

Web services abstract away all that proprietary information; you don’t need to know anything about the server platform, object model, programming language, or any implementation detail to use the service – you just need to know how to use HTTP.

Of course they also address the original concern – all you’re sending back and forth is data, not HTML markup. So you can expect your data in a specific format (usually XML or JSON), and you don’t have to do any extra work other than deserializing the data in order to access it.
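To make that concrete, here’s a minimal sketch of what “just deserialize it” looks like with JSON. The response body and its field names (CustomerID, Name, Invoices) are made up for illustration; a real service would define its own.

```python
import json

# Hypothetical response body from a customer web service (invented for this example).
response_body = '{"CustomerID": 5, "Name": "Chad", "Invoices": [101, 102]}'


def parse_customer(body: str) -> dict:
    """Deserialize a JSON response body into a plain Python dict."""
    return json.loads(body)


customer = parse_customer(response_body)
print(customer["Name"])      # look fields up by name, no HTML parsing needed
print(customer["Invoices"])
```

Compare that with scraping the same values out of an HTML page: there’s no markup to walk through, so a redesign of the site doesn’t break the client.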

There are two general kinds of web services: Remote Procedure Calls (RPC) and Web APIs.

RPC

The idea of a remote procedure call is that you’re executing a method on an object that lives on a web server, as opposed to something that lives in memory. This is primarily done using the Simple Object Access Protocol (SOAP). SOAP is basically a structured XML document that is sent back and forth representing the object, method you’re calling, parameters you’re passing, and its return value.
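As a rough sketch of what such a document looks like, the snippet below builds a SOAP 1.1 envelope for a single method call. The method name (GetCustomer), its parameter, and the example.com service namespace are all hypothetical; a real service spells these out in its own definition.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.com/customers"            # hypothetical service namespace


def build_soap_request(method: str, params: dict) -> str:
    """Wrap a method call and its parameters in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # The method being called becomes an element inside the Body...
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{method}")
    # ...and each parameter becomes a child element of the method.
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_soap_request("GetCustomer", {"CustomerID": 5})
print(request_xml)
```

Even for a one-parameter call there’s a fair bit of wrapping around the single value you actually care about.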

A lot of programming languages have tools that make a web service call look indistinguishable from a normal method call, abstracting away all the serialization/deserialization that needs to happen. They do this by using a Web Services Description Language (WSDL) document published by the web service. A good example of this is the “Add Service Reference” project option in Visual Studio – you point it at a web service, and it generates a “proxy” object for you which you can then construct and execute methods on like any normal object.

Behind the scenes, that proxy object is created using the WSDL file that is downloaded from the server. The WSDL lists all the methods available from the web service, as well as the expected input and return types for each method. Using this information, a code generation tool makes a class definition that has all those methods and expected parameters/return types. The body of those proxy methods actually creates the XML document for you, and also manages sending and receiving the data from the web service. Imagine if you had to do all of that yourself!
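Here’s a hand-written stand-in for the kind of class a code generator might produce, just to show the shape of it. Everything in it is an assumption for illustration: the CustomerServiceProxy class, the GetCustomerName method, and the fake_send function that plays the part of the server so nothing goes over the network. Real generated proxies are much longer and handle types, faults, and the HTTP transport for you.

```python
from typing import Callable
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


class CustomerServiceProxy:
    """Illustrative proxy: each method hides the XML plumbing of one service call."""

    def __init__(self, send: Callable[[str], str]):
        self._send = send  # transport: takes request XML, returns response XML

    def get_customer_name(self, customer_id: int) -> str:
        # Serialize the call into a SOAP envelope.
        envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
        body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
        call = ET.SubElement(body, "GetCustomerName")
        ET.SubElement(call, "CustomerID").text = str(customer_id)
        # Send it, then deserialize the return value out of the response.
        response = ET.fromstring(self._send(ET.tostring(envelope, encoding="unicode")))
        return response.find(f"{{{SOAP_NS}}}Body/GetCustomerNameResponse/Name").text


def fake_send(request_xml: str) -> str:
    """Fake transport standing in for an HTTP POST to the (hypothetical) service."""
    request = ET.fromstring(request_xml)
    customer_id = request.find(f"{{{SOAP_NS}}}Body/GetCustomerName/CustomerID").text
    name = {"5": "Chad"}.get(customer_id, "Unknown")
    return (
        f'<soap:Envelope xmlns:soap="{SOAP_NS}"><soap:Body>'
        f"<GetCustomerNameResponse><Name>{name}</Name></GetCustomerNameResponse>"
        "</soap:Body></soap:Envelope>"
    )


proxy = CustomerServiceProxy(fake_send)
print(proxy.get_customer_name(5))  # reads like a normal method call
```

The calling code never sees any XML, which is exactly the point of the proxy.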

There’s a lot of overhead in RPC-style web services. A lot of XML is transferred back and forth, which (although auto-generated) gets big and cumbersome. Not only that, but you get bound pretty tightly to a very specific version of the server’s WSDL document – if anything changes (even if they upgrade their libraries!) your client code may not work.

Web API

A few smart people thought that there must be a way of slimming the RPC overhead down, and they created the concept of Web API services. Web API services mostly differ in that they take advantage of the HTTP standard more than a normal SOAP endpoint does. Instead of having to execute a very specific method on a very specific endpoint, you use simple URLs and GET/POST variables to send and receive data.

This is a lot more lightweight because you no longer have to follow the strict rules of SOAP. To talk to a specific resource, you just need to know its URL, instead of having to download its WSDL and figure out what the SOAP envelope is supposed to look like. To get a list of search results from Twitter, I just need to send a standard HTTP request to search.twitter.com/search.json?q=test

Also instead of sending a sizable SOAP XML document, you can send your variables using the standard HTTP =/& format (i.e. CustomerID=5&Name=Chad). And because this is all part of the HTTP standard, odds are really good that your programming language already has a way of accessing it using a standard web client. Because of the reliance on standards for data access and updating, it’s much easier to consume and combine these web services. In fact, that was one of the main goals of Web API services – to enable “mashup” scenarios.
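As a small sketch of how little work that is, Python’s standard library can produce and read the =/& format directly. The build_url helper is a name I made up, and the http:// scheme on the Twitter URL is an assumption; the snippet only builds the strings and doesn’t send any request.

```python
from urllib.parse import urlencode, parse_qs


def build_url(base: str, params: dict) -> str:
    """Append URL-encoded query parameters to a base URL."""
    return f"{base}?{urlencode(params)}"


# Encode variables in the standard key=value&key=value format.
form_body = urlencode({"CustomerID": 5, "Name": "Chad"})
print(form_body)  # CustomerID=5&Name=Chad

# The same encoding builds the query string of a GET request.
search_url = build_url("http://search.twitter.com/search.json", {"q": "test"})
print(search_url)

# Decoding works just as easily; values come back as lists of strings.
decoded = parse_qs(form_body)
print(decoded["Name"])
```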

REST

So you’ve probably heard the term REST at some point in the last 3 years or so. REpresentational State Transfer is kind of an offshoot of Web API, and provides a set of guidelines for implementing Web API services. There are quite a few guidelines, but some of the commonly-implemented ones include:

  • using specific HTTP verbs for data operations (Create = POST, Read = GET, Update = PUT/PATCH, Delete = DELETE),
  • a focus on stateless resources (every individual call to a REST web service can be made regardless of the calls before or after it), and
  • clean resource URLs (/customers/5/invoices instead of /customers?id=5&show=invoices).
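The verb and URL guidelines can be sketched as a tiny helper that turns an operation on a resource into a verb and a path. It uses the common convention of POST to create and PUT to update, and the rest_request function and VERBS table are hypothetical names for illustration, not part of any framework.

```python
from typing import Tuple

# Conventional mapping of data operations to HTTP verbs.
VERBS = {"create": "POST", "read": "GET", "update": "PUT", "delete": "DELETE"}


def rest_request(operation: str, *segments) -> Tuple[str, str]:
    """Return the (HTTP verb, clean resource path) pair for an operation."""
    path = "/" + "/".join(str(segment) for segment in segments)
    return VERBS[operation], path


print(rest_request("read", "customers", 5, "invoices"))  # ('GET', '/customers/5/invoices')
print(rest_request("create", "customers"))               # ('POST', '/customers')
print(rest_request("delete", "customers", 5))            # ('DELETE', '/customers/5')
```

Because each call carries everything the server needs (the verb plus the full resource path), none of these requests depends on one made before it.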

I could write an entire post on REST and its current adoption, so I’ll save the rest for later.

All in all, Web Services are a pretty fundamental part of the web and data sharing going forward. Tonnes of companies offer web services, and quite a few large companies base their entire business model on them. Odds are if you need some data, there’s an API for it – just search through Google, or check out the list of APIs on ProgrammableWeb.
