Saturday, September 27, 2008

Properties - A False Sense of Encapsulation

I'm having more and more mixed feelings about properties in .NET, or accessor methods in general, as properties compile to get_xxx/set_xxx methods behind the covers. I've never thought about it that much, but now they seem to have become a warning flag of bad design.

Everyone knows by now that public fields are evil. I'm not going to elaborate on that anymore. However, I've come to believe that properties are not the answer either. When a class has a particular number of private fields, then the general malpractice is to also include corresponding properties for these private fields. Whether these properties are read-only or include both getters and setters is not important. The main point is that this habit should be questioned. To me, the use of properties should be applied with great care.

But first, let me make a clear distinction between objects and data structures. An object hides its data and exposes methods to operate on that data. These methods provide the behavior of an object. A data structure, also known as a Data Transfer Object (DTO), only exposes its data and has no methods. DTO's can be seen as data carriers that have no behavior whatsoever. Objects are best used in domain models, while DTO's, as their name implies, are best used for transferring data from one layer to another.

Why this distinction? For starters, because I know a lot of people who talk about DTO's as if they were full blown objects. They use DTO's as their primary domain inhabitants which leads to the Anemic Domain Model anti-pattern. And secondly, because the use of properties mentioned in this post obviously applies to objects and not DTO's.

As I already stated in the definition above, objects hide their data. Relentlessly exposing this data through properties breaks encapsulation of these objects. You might say that this isn't the case because properties 'hide' the private member variables of the class. But properties are just another intricate way of exposing private data.

Let's go through some examples of bad code, shall we.

Inconsistent objects

Suppose we have a class named Company in our domain. Something we see very often is something like the following:

public class Company
{
    private String _city;
    private String _country;
    public String City
    {
        get { return _city; }
        set { _city = value; }
    }
    public String Country
    {
        get { return _country; }
        set { _country = value; }
    }
}

Suppose we have the requirement that the user of our system can change the location of our company in case it spontaneously settles itself in a different country. The calling code would look something like this:

company.City = "Erps-Kwerps";
company.Country = "Belgium";

As you can see, this operation is clearly not atomic. It allows you to have inconsistent objects. Suppose the location was previously Hong Kong, China. The object becomes inconsistent after the City property is changed to 'Erps-Kwerps', which is definitely a long way from China.

The better approach would be the following:

public class Company
{
    private String _city;
    private String _country;
    public void ChangeLocationTo(String city, String country)
    {
        _city = city;
        _country = country;
    }
}

Removing the properties guarantees encapsulation. Providing the ChangeLocationTo method not only makes the requirements explicit in code, but it also ensures that the operation is performed atomically.

The calling code gets really easy now:

company.ChangeLocationTo("Erps-Kwerps", "Belgium");

The most important thing here is that code that belongs to the Company class has now moved to where it belongs.

Procedural code

This is probably the anti-pattern I see the most: procedural programming in an OO language, aka Cobol in C#. Behold the following application service:

public class BeerService
{
    public void OrderDrink(Int64 customerId)
    {
        Customer customer = Repository<Customer>.Get(customerId);
        if (customer.Age > 16)
        {
            // Cheers!
        }
    }
}
public class Customer
{
    private Int32 _age;
    public Int32 Age
    {
        get { return _age; }
        set { _age = value; }
    }
}

This very simple example is quite obvious. The logic in the application service extracts the data from the object. The OO way would be to put the logic where the data is, namely the Customer class:

public class Customer
{
    private Int32 _age;
    public Boolean IsOlderThanMinimumAge()
    {
        // Compare the private field directly; there is no Age property anymore.
        return _age > 16;
    }
    public Boolean CanHaveBeer()
    {
        return IsOlderThanMinimumAge();
    }
}

See, no more properties and most importantly, we now adhere to the Tell, Don't Ask principle as described in The Pragmatic Programmer. We don't ask for the data, we tell the object that has the data to do the work for us.
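To round out the example, here is a minimal sketch of what the application service could look like once the rule lives in Customer. The in-memory CustomerRepository and the Boolean return value of OrderDrink are hypothetical stand-ins, added only so the sketch compiles on its own; the real Repository<Customer> and whatever happens in the "Cheers!" branch are not shown in this post.

```csharp
using System;
using System.Collections.Generic;

public class Customer
{
    private readonly Int32 _age;

    public Customer(Int32 age)
    {
        _age = age;
    }

    // The age rule lives next to the data it needs.
    public Boolean CanHaveBeer()
    {
        return _age > 16;
    }
}

// Hypothetical in-memory stand-in for Repository<Customer>.
public class CustomerRepository
{
    private readonly Dictionary<Int64, Customer> _customers =
        new Dictionary<Int64, Customer>();

    public void Add(Int64 id, Customer customer)
    {
        _customers[id] = customer;
    }

    public Customer Get(Int64 id)
    {
        return _customers[id];
    }
}

public class BeerService
{
    private readonly CustomerRepository _repository;

    public BeerService(CustomerRepository repository)
    {
        _repository = repository;
    }

    // Returns true when a drink was served (assumed return value for this sketch).
    public Boolean OrderDrink(Int64 customerId)
    {
        Customer customer = _repository.Get(customerId);

        // Tell, don't ask: the service never sees the age itself.
        if (customer.CanHaveBeer())
        {
            // Cheers!
            return true;
        }
        return false;
    }
}
```

The service now only knows that a customer can or cannot have a beer. If the minimum age changes, Customer is the only class that has to change.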

Hello Real World

I know these examples are very simplistic and that real-life isn't always as shiny. There are a lot of cases where adding properties is the right thing to do. The first thing that comes to mind are unit tests on domain objects. Suppose I write a unit test that calls a particular method on an object. The unit test probably wants to verify the state (data) of the domain object. I've seen solutions from some developers who let the domain object handle the assertions, which implies that the assemblies of the unit test framework of choice would be deployed in the production environment.

Another problematic scenario would be a mapping class, which is responsible for mapping the data of a domain object to a DTO. One could argue that this could be done by the domain object itself, but what if there are multiple DTO's (different views) for the same domain object? That also smells like an SRP violation to me. Another option would be to use reflection.
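For what it's worth, the reflection option could look roughly like the sketch below. CompanyDto, CompanyMapper and ReadPrivateField are hypothetical names made up for this illustration, and the Company constructor is an assumption so there is some data to map. Only Type.GetField and FieldInfo.GetValue come from the .NET reflection API.

```csharp
using System;
using System.Reflection;

public class Company
{
    private String _city;
    private String _country;

    public Company(String city, String country)
    {
        _city = city;
        _country = country;
    }
}

// Plain data carrier: data only, no behavior.
public class CompanyDto
{
    public String City { get; set; }
    public String Country { get; set; }
}

// Hypothetical mapper that reads private fields through reflection,
// so Company does not need getters just for mapping.
public static class CompanyMapper
{
    public static CompanyDto ToDto(Company company)
    {
        CompanyDto dto = new CompanyDto();
        dto.City = (String)ReadPrivateField(company, "_city");
        dto.Country = (String)ReadPrivateField(company, "_country");
        return dto;
    }

    private static Object ReadPrivateField(Object source, String fieldName)
    {
        FieldInfo field = source.GetType().GetField(
            fieldName, BindingFlags.Instance | BindingFlags.NonPublic);
        if (field == null)
        {
            throw new ArgumentException("No private field named " + fieldName);
        }
        return field.GetValue(source);
    }
}
```

The price is that the field names are now strings: renaming _city compiles fine and only blows up at runtime, so a mapper like this needs a test of its own.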

A while ago, I was exploring the code of the TimeAndMoney DDD example (yes, Java code). While I was reading the code of the Money class, I found this intention-revealing approach for adding getters to a class: (Mayday, Mayday, Java code coming up ;-) )

/**
 * How best to handle access to the internals? It is needed for
 * database mapping, UI presentation, and perhaps a few other
 * uses. Yet giving public access invites people to do the
 * real work of the Money object elsewhere.
 * Here is an experimental approach, giving access with a
 * warning label of sorts. Let us know how you like it.
 */
public BigDecimal breachEncapsulationOfAmount() {
    return amount;
}
public Currency breachEncapsulationOfCurrency() {
    return currency;
}

Maybe role interfaces described by Martin Fowler and popularized by Udi Dahan can provide a clean solution here. A domain object that implements two types of interfaces:

  • interfaces containing nothing but property getters (query).
  • interfaces containing a single method (command).
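Applied to the Company class from earlier, the two kinds of interfaces might look something like this. It is only a sketch: the interface names ICompanyLocation and IRelocatable and the Company constructor are my own assumptions, not taken from Fowler's or Udi Dahan's writing.

```csharp
using System;

// Query role: nothing but getters, meant for mapping or presentation code.
public interface ICompanyLocation
{
    String City { get; }
    String Country { get; }
}

// Command role: a single intention-revealing method.
public interface IRelocatable
{
    void ChangeLocationTo(String city, String country);
}

public class Company : ICompanyLocation, IRelocatable
{
    private String _city;
    private String _country;

    public Company(String city, String country)
    {
        _city = city;
        _country = country;
    }

    // Explicit interface implementations: the getters are only reachable
    // through an ICompanyLocation reference, not through Company itself.
    String ICompanyLocation.City { get { return _city; } }
    String ICompanyLocation.Country { get { return _country; } }

    public void ChangeLocationTo(String city, String country)
    {
        _city = city;
        _country = country;
    }
}
```

Code that holds a Company cannot read City or Country directly. A mapper or a view has to ask for an ICompanyLocation, which makes the breach of encapsulation visible at the call site, much like the breachEncapsulationOf prefix does in the Java code above.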

If you, my dear reader, have any thoughts about this, I'm glad to hear them.

Conclusion

As always, there are two camps here: on the left are the people who don't believe that properties violate encapsulation and on the right are the people who despise properties and never ever use them in their code. I'm currently sitting somewhere in between (I keep seeing gray ;-)). I do believe that we should quit the habit of providing properties for private member variables by default. We should only provide access to private data when there is no other clean alternative. As always, being dogmatic is not a good thing. Reducing visibility of data as much as possible leads to more reliable and maintainable code. We should beware of code that calls more than a single method on the same object. Properties are not evil, but in a lot of cases inappropriate. We should be able to resist this temptation.

To round off this long post, let me just point you to an article I found on Java World (yes, again Java) written in 2003 by Allen Holub, named Why getter and setter methods are evil (don't ask how I found it), which explains the case I'm trying to make with this post in a much clearer way.

Update 09/27/2008: It seems that this topic is also being discussed on the ALT.NET user group.

Wednesday, September 24, 2008

Little Secret

I'll let you in on a little secret. The key to writing good comments is ... (rolling the drums) ... not writing them at all! Let me elaborate on that.

To me, there are two kinds of comments:

  • The ones that appear right in the guts of a particular method.
  • The ones that appear right above a particular method, mostly in the form of XML comments.

Code comments

I try to avoid this kind of comment at all times. Sure, there is a time and place for comments like these. But if I do feel the urge to write a comment to clarify a piece of code, then I turn back to the code, looking for a better and cleaner way of expressing my intent. If I'm just too stupid and can't find a cleaner way of writing those particular lines of code, then and only then would I consider putting a comment in place (probably feeling bad and miserable for the rest of the day).

Writing a comment clutters the code. Besides that, a comment gets out-of-date sooner than Master Yoda can use his lightsaber. Most developers just don't have the discipline for maintaining comments. It's all about finding the best way to communicate with the future readers of the code. The best way is through the code itself, with code comments coming in a miserable second.
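A small sketch of what turning back to the code can mean in practice. The Employee class, its fields and the benefits rule are entirely made up for this illustration. The only point is that a well-named method and two named constants say what the comment used to say.

```csharp
using System;

public class Employee
{
    private const Int32 RetirementAge = 65;
    private const Int32 MinimumYearsOfService = 10;

    private readonly Int32 _age;
    private readonly Int32 _yearsOfService;

    public Employee(Int32 age, Int32 yearsOfService)
    {
        _age = age;
        _yearsOfService = yearsOfService;
    }

    // Before, the calling code needed a comment to explain itself:
    //     // check if the employee is eligible for full benefits
    //     if (age >= 65 && yearsOfService >= 10) ...
    //
    // After, the method name and the constants carry that meaning.
    public Boolean IsEligibleForFullBenefits()
    {
        return _age >= RetirementAge
            && _yearsOfService >= MinimumYearsOfService;
    }
}
```

The caller then simply reads if (employee.IsEligibleForFullBenefits()), and there is no comment left to go stale.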

XML comments

A few years ago (roughly 25 years on the IT calendar), when I stepped out of the C++ world into the .NET world, I found this tool called NDoc. Back then, this was one of the coolest tools in my tool bag. The entire world was telling me that writing comments was actually a good thing. So I did. I wrote XML comments for practically every method I've put out. Nothing could stop me.

Over the years however, I've come to my senses (it was just my time, it was just my time). Today, I use XML comments sparingly. The only time I ever use XML comments is for writing documentation for public API's that are going to be used by other developers in other teams. Even then I feel bad about it because I almost know for certain that nobody is going to read them.

Again, I pull the card of maintainability here. The code changes, but the XML comments mostly never do. I've also seen too many of these:

/// <summary>
/// The first name.
/// </summary>
public String FirstName
{
    get { return _firstName; }
}
/// <summary>
/// The first name.
/// </summary>
public String LastName
{
    get { return _lastName; }
}
/// <summary>
/// The first name.
/// </summary>
public String Company
{
    get { return _company; }
}

See what I mean? Notice the clarifying nature of the comments? Notice the implementation of copy-paste driven development?

So kids, stay away from comments. If you care about what you do as a professional developer, then you try to communicate using lines of beautiful code.

Ciao

Sunday, September 21, 2008

Commented-Out Code and Broken Windows

I know that there are numerous blog posts already written about this topic, but I just can't resist. I've just had it with commented-out code. Sure, everybody agrees that this is evil in its purest form, but why do I still have to endure so much commented-out code? It's bad for my heart.

Don't trust the source-control system? Get rid of it and replace it with a more reliable source-control system then. Check in more often! Feeling guilty about throwing away code? Too bad, go see a shrink or something. Keep your code clean at all times!

To me, a piece of code that contains commented-out lines is just another piece of crap. Most developers tend to just remove the commented-out lines and get on with it. I'm a bit more radical about this. I tend to throw away the whole shebang, like a tumor that needs to be removed along with its roots. That's right. I remove all of the surrounding code as well.

Writing software is an act of discipline. Commented-out code is like a broken window. It introduces rot into a software system.

What triggered me to write this blog post, besides the fact that I just can't stand commented-out code and have to see so much of it? Well, I was reading the foreword of the book Clean Code which mentioned this again (and also chapter 4). Why does this need to be written down again and again (including this blog post, I know, I know)? It's like periodically publishing articles in the newspapers that leaving dead bodies lying around on the street is a bad thing. Isn't this just common sense? Not in this industry.

Glad that I've got this off my chest. Again.

Jan, the relieved

Saturday, September 20, 2008

Book Review: Release It! Design and Deploy Production-Ready Software

I just finished reading Release It!. I know that Ayende is really fond of this book, and I can't blame him.

Like most books I've read over the last couple of years, this book is not about the technology du jour. It's about how to design and build software that stays up-and-running in a production environment. This is a really important topic. We like to believe that the environment in which we build and test our software is good enough. But let's face it, it never ever gets close to the final production environment in which our software has to live, despite the massive amounts of effort we and others put into setting up a decent development/acceptance environment. To let our software survive in a production environment, we as software engineers need to incorporate some patterns and practices.

Designing our software for production is a whole different ball game though. Normally we aim for passing QA, fulfilling the functional requirements without considering how the resulting software system behaves in production and with how much capacity.

A while ago, I was listening to this episode of .NET Rocks with Udi Dahan about scaling web applications. Also, Colin kindly pointed me to this article on Udi's blog that really struck a nerve. I really like exploring this kind of architecture. It's based on years of experience and that is the kind of stuff I want to learn.

The same goes for this book. It's based on years and years of experience in keeping medium to large sized software systems alive in production. It's full of patterns and anti-patterns of stability, capacity and operations.

I couldn't help but notice how often the author mentions the importance of loosely-coupled software designs. Although this aspect of a software system alone is probably not enough, my personal experience shows that loosely coupled systems are much more stable in production than tightly coupled ones.

Tight coupling allows cracks in one part of the system to propagate themselves - or multiply themselves - across layer or system boundaries.

Something I've seen a lot over the past couple of years, and also at my current employer, is when a software system finally makes it into production, the stakeholders consider it finished. As soon as a piece of software (read massive investment) hits a production server, it is left alone and considered an artifact of the past. To me, this is just foolish.

The true birth of a system comes not on the day that design and development begins, or even when the project is conceived, but on the day it launches into production. This is a beginning, not an end. Over time, the system will grow and mature. It will gain new features. It will lose defects, and perhaps gain some too. It will become what it was meant to be, or, even better, it will become what it needs to be. Most of all, it must change. A system that cannot adapt to its environment is stillborn.

Rome didn't get built in one day either. Why should this be the case for a valuable software system? This partly comes back to how tightly coupled a software system is designed.

Anyway, in case you didn't notice by now: go buy and read this book. I have written down so much stuff while reading it, that I probably have enough material for writing blog posts until the end of this year. The book practically pays for itself ;-).

I welcome Release It! to my growing favorite book list. The next book I'll be reading is Clean Code, which got delivered earlier this week. I've been really anxious to read it.

Till next time,

Jan, the literate

Monday, September 15, 2008

Learn Many Architectures, and Choose Among Them

I just came across the following quote from Release It!: Design and Deploy Production-Ready Software that fully describes the point I tried to make in my previous post on "One-size-fits-all architectures":

Not every system needs to look like a three-tier application with an Oracle database. Learn many architectural styles, and select the best architecture for the problem at hand.

Here you go. Book review coming soon.

Till next time,

Jan, the book maggot.

Sunday, September 07, 2008

Ventilation of Thoughts

Earlier this week I went back to work after a refreshing (and very much needed) vacation. During this past week I participated in some interesting discussions that made me think about software engineering and our craft in general. I want to share some of my thoughts, so here goes ...

Design == Code

In my latest book review of Agile Principles, Patterns and Practices in C#, I mentioned an article that is included as an appendix named What is Software?. This article, that originally appeared in C++ Journal somewhere in 1992 (!), really got me thinking. Some discussions about software design earlier this week added to my thoughts and made me come to realize that the source code of a software system is also the design.

Software design is not about drawing UML diagrams. UML diagrams may represent a part of the design, but they are definitely NOT the design itself. Sure, we can all draw nice looking UML diagrams and pat ourselves on the back for being smart. But does what we have drawn actually work? We can stare at UML diagrams for months, discuss them with our co-workers for ages and change them forever.

The only way to find out if what we have drawn actually works is to get our feet wet and start implementing code to back things up.

If the source code is the design of a software system, then this implies that software developers are software designers! This is why agile methodologies make sense. They encourage self-organizing teams where every team member is responsible for the quality of the software system they are building as opposed to traditional waterfall approach. I wrote about this here.

Software engineering vs manufacturing

Also during these discussions, some of the attendees made some comparisons between software engineering and manufacturing or construction. I really don't like this analogy. Besides being pointless, it doesn't apply.

Take airplane manufacturing for example. When building a new type of airplane, aerospace designers create blueprints in order to document how to build such a thing. These blueprints are used by the manufacturing department to build this new type of airplane. In order to find out if their designs make sense, models are built that can be tested in all kinds of situations, like wind tunnels. They use these models to see if the airplane that they are trying to build will actually fly. What if it doesn't fly? What if the wings fall off in nasty weather? They want to know this kind of thing before manufacturing starts building these things. But the main reason for this BDUF is that building and testing these kinds of models is much cheaper than building an airplane right away. That's just common sense, right? When the blueprints are ready and fully tested, then the manufacturing department is put into action and starts building new airplanes.

As even my youngest daughter knows by now, waterfall and a BDUF like this have failed miserably in the software industry. Is a BDUF, drawing massive amounts of UML diagrams, that much cheaper than creating and testing the code for the software that it represents? Don't think so.

This doesn't mean that we should start coding right away. As mentioned earlier, developers are also designers. Thinking about the design of the software system that we are building before writing the source code is common sense, but as an ongoing process.

One-size-fits-all architectures

I just don't buy it. Something like this got mentioned in a discussion and I don't think it's a good idea. I do believe that there are universal design principles to adhere to, like SRP, OCP, etc. However, I just don't like the idea that every application has the same architectural needs. Sometimes a data-driven approach is justified. Most of the time, a domain-driven approach is needed. There are lots of differences between these two approaches, so I don't like the idea of choosing one or the other and sticking to this decision regardless of the requirements.

Fear

I've been a software engineer for about 8 years now. During this time, I've seen a lot of decision making based on fear. Fear of making refactorings to existing code, fear of adding new features to an application with an unmaintainable code base, fear of not meeting deadlines, fear of changing requirements, etc. ... I could go on like this.

We've all done it, postponing some refactoring because of some deadline. Writing some quick and dirty code instead of doing the right thing.

I must admit, during my first few years as a software developer I was also like this, but I didn't like it that much. A decision made out of fear is almost always the wrong one. Fear prevents us from seeing things in a clear, objective way. So I tried to get out of the habit. If I see a refactoring one week before a deadline, then I just do it if it adds to the maintainability of the system. If I have an epiphany about our implementation of the domain while doing the dishes, then this is something that needs to be done in order to suit the domain model.

When it comes to software engineering, I have just one fear left: fear of not delivering quality software. Building quality software means doing the right thing, even if it means you were not entirely right the first time you entered the code.

Still here? I'm done now.

Till next time,

Jan, the unburdened.