Can someone who knows about OOP give me some cultural context for something like Sandi Metz’s Four Rules?
1. Your class can be no longer than 100 lines of code.
2. Your methods can be no longer than five lines of code.
3. You can pass no more than four parameters and you can’t just make it one big hash.
4. When a call comes into your Rails controller, you can only instantiate one object to do whatever it is that needs to be done. And your view can only know about one instance variable.I’ve never used Ruby at all, and I imagine that the people who do this stuff are doing very different kinds of programming than what I do. But this sounds hellish, mostly because of rule #2. The things I want my program to do are already complicated. The last thing I want to do is force myself to add additional complexity by artificially dividing the task into many tiny pieces spread over many disparate bits of code, each of which will have its own name I whose meaning I am guaranteed to forget later.
(I understand that rules like these are meant to build good habits and should not be literally applied all the time. I just don’t understand how #2 is a good habit.)
hey! I like oop! i took a three-day class from sandy metz once! I am probably at least somewhat qualified to talk about this topic, which is frankly a first for me on tumblr.
The first thing to understand is that these ideas are coming from the perspective of backend web development—specifically, Rails development. Rails has very little boilerplate, and Ruby itself is a very expressive language. Some things that would probably take me 15 lines to write in Java only take me 3 in Ruby, for example. So if you’re working in a different language, inflate those lines by at least 2x. And specifically the cultural context here (as other people have mentioned) is about small team entry-level app development—they’re meant for people who don’t feel very confident writing OO code yet, and keeping a moderately sized codebase clean.
That said, here’s the core idea behind OOP: Code should be readable, composable, and maintainable. We think objects are a good way to achieve those goals.
This first rule is motivated by the concepts of Single Responsibility and Interface Segregation. These say that your class should only be responsible for one thing, and that no consumers of that class should be forced to depend on methods they don’t need. The rule of thumb here is that you should keep your classes (and thus the interfaces those classes expose) very small, to increase the flexibility of the class, avoid making it very brittle to interface with that class, and to increase the ability to reuse/compose the code you write.
The third rule is also about interface segregation. Specifically, it’s about the concept of coupling. If you have a method with lots and lots of parameters, it’s probably due to one of two things: either that method is so highly coupled to another place in your code that no one else is ever going to reasonably be able to consume it—you need 4 of this object, 6 of the other, the moon at three-quarters gibbous, that sort of thing. Or it’s a sign that some of your variables are so closely related—an x and a y coordinate, for example—that you’re almost never going to pass someone one without passing the other, and they should be encapsulated into another object.
The fourth rule is pretty specific to model view controller code so i’m just gonna pass over it for now (unless people really want to hear about it?)
The second rule, undoubtably, is the most controversial. Five lines?? Who can get anything done in five lines? Well, this goes back to the idea that your code should readable and composable. Instead of writing one long function to encompass work you want to do, you should break that work up into a mix of objects and behaviors that you can compose into the method that you want.
Here’s an example from some code I just wrote. It parses a HTML page to extract a list of peer-review assignments for students:
for student in page.select(".student_reviews"): user_id = student.find(class_="student_review_id").text assigned_ids = student.select(".peer_review .user_id") reviews[user_id] = assigned_ids[0].text if assigned_ids else None for i in assigned_ids: reviews_for[i.text].append(user_id) name = student.find(class_="assessor_name").text.split(", ") names[user_id] = " ".join([name[-1]] + name[:-1])This is working code, but it’s a complete pain to read, understand, and reason about. Let’s try to sketch out what a more object oriented version might look like:
class Section(object): @staticmethod def from_page(page): self.page = page def student_dom_objects(self): return page.select(".student_reviews") def students(self): return [Student.from_element(i) for i in self.student_dom_objects()] def reviews(self): return {s.id(): s.reviews() for s in self.students()} def reviews_for(self, student): return [s for s in students() if student.id() in s.reviews()] class Student(object): @staticmethod def from_dom_object(dom_obj): self.dom_obj = dom_obj def id(self): return self.dom_obj.find(class_="student_review_id").text def reviews(self): return [i.text for i in self.dom_obj.select(".peer_review .user_id")] def name_given(self): return self.dom_obj.find(class_="assessor_name").text def name_formatted(self): last, first = self.name_given.split(", ") return " ".join((first, last))There’s a lot that happened here, but the most obvious is that all of our actual code got removed—I elided a Section.from_page(page) call—and now everything is just declarative, instead of imperative. One benefit to that is that all of our methods instantly become shorter. And, because we’re composing methods, it becomes really easy to maintain—there’s only one place to change something, and it’s obvious where it might be. And now we have a clean, easy to use interface that’s easily extendable.
Obviously, there’s more to it then that, but that’s basically my version of the “Intro to OO” pitch. Does that clear things up a little?
Thanks, this is also helpful.
I am wary of composability as an ideal, for reasons I stated here. As you get more and more objects and methods involved in performing a given task, you’re allowing the actual code that performs that task to be spread more and more widely across the code base, and requiring the reader to trace back more steps in order to have full comprehension of what any line is actually doing. And if you want to follow what the code is doing closely, you have to jump around nonlinearly more and more. @gattsuru used the phrase “ravioli code” for this, and Googling it, it seems like other people have made the same complaint, e.g.:
I should have noted why I think that Ravioli Code is a bad thing (and hence that those who think it is good style are doing a disservice to their trade). The problem is that it tends to lead to functions (methods, etc.) without true coherence, and it often leaves the code to implement even something fairly simple scattered over a very large number of functions. Anyone having to maintain the code has to understand how all the calls between all the bits work, recreating almost all the badness of Spaghetti Code except with function calls instead of GOTO. It is far better to ensure that each function has a strong consistent description (e.g. “this function frobnicates the foobar”, which you should attach to the function somehow - in C, by a comment because there’s no stronger metadata scheme) rather than splitting it up into smaller pieces (“stage 1 of preparing to frobnicate the foo part of the foobar”, etc.) with less coherence. The principal reason why this is better is precisely that it makes the code easier overall to understand.
Of course, it makes sense to group things together if they actually tend to get re-used together (like having x and y coordinates be attributes of the same object), or to put some code in a function of its own if it forms a distinct conceptual block. But my understanding is that the “ifs” in the previous sentence should be read as “if-and-only-ifs”; the point of abstractions like functions and objects is to group some things together as well as to separate them from other things, in order to exploit actual regularities of the task or conceptual domain in your code.
If, instead, you try to make everything as modular as possible all the time, you’re no longer making useful “these form a group, apart from these other things” distinctions; you’re just splitting everything up as finely as it possibly can be.
