This course will teach you how to write clean code.
Clean code makes bugs less frequent, and easier to detect when they do occur.
It also makes it easier for others to help you with your code. In industry,
writing clean code will ensure that anyone on the team and understand and maintain
it. This increases the value of your code—and you—to the company.
Follow these guidelines all the time—from the beginning.
Rule: Functions that are never called from within a file and are not part of the public interface (i.e., required for homework) must be removed or commented out.
✗
int main() {
return EXIT_SUCCESS;
}
void _foo() { // not called by main(…) or anything else
// ...
}
Reason: Unused functions (including old versions) cause confusion. You may think you're debugging the one that gets called and the realize you're debugging something that never gets called at all.
Rule: Declare and initialize for loop counter in the for statement.
✓
for(int i = 0; i < 5; i++) {
printf("%d\n", i);
}
✗
int i;
for(i = 0; i < 5; i++) {
printf("%d\n", i);
}
✗
int i = 1;
for(; i < 5; i++) {
printf("%d\n", i);
}
Exception: Occassionally, you may need to use the counter in code below the for loop. In that case, declare and initialize outside the for loop (like Bad #1 above) and explain why in a comment.
✓
int divisor = 1; // will be used after the loop
for(; divisor < n; divisor++) {
…
}
return divisor;
Rule: Do not use global variables, except for constants, unless there is a clear and compelling reason.
✗
#include <stdio.h>
int g = 0;
int main(int argc, char* argv[]) {
g += 1;
…
}
✓
#include <stdio.h>
const int NUM_FEET = 2;
int main(int argc, char* argv[]) {
…
}
Note:
In ECE 26400, we will tell you if there is a clear and compelling reason. Otherwise, do not use global variables, except for constants (declared with const).
Rule: Declare local variables on the line where they are first assigned to, or as close as possible.
✗
int i;
i = 0;
✓
int i = 0;
✗
int i;
for(i = 0; i < n; i++) {
▒▒▒▒▒▒▒▒▒▒;
}
✓
for(int i = 0; i < n; i++) {
▒▒▒▒▒▒▒▒▒▒;
}
Exception: In rare occassions, there may be no meaningful value to initialize a variable to. Don't initialize to a meaningless value just to comply with this rule, since that could prevent Valgrind and/or GCC from warning you if you try to access an uninitialized value.
✗
int a = 0; // 👎BAD – Do not initialize to a meaningless placeholder value.
if(▒▒▒) {
a = ▒▒▒;
}
else {
a = ▒▒▒;
}
foo(a)
✓
int a; // 👌
if(▒▒▒) {
a = ▒▒▒;
}
else {
a = ▒▒▒;
}
foo(a)
This is okay because a will be initialized in one of ≥2 subscopes (i.e., the if block or else block).
a cannot be declared inside the if and else block because it will be used below the if/else.
Do not initialize to a placeholder (e.g., 0), unless that would be a legitimate default that should be kept in some cases.
Exception: You cannot declare a variable inside a switch/case statement.
✗
switch(▒) {
case ▒:
int a = 5; // 👎 Compiler error
foo(a);
default:
▒▒▒▒▒▒▒▒▒▒;
}
✓
int a; // 👌
switch(▒) {
case ▒:
a = 5;
foo(a);
default:
▒▒▒▒▒▒▒▒▒▒;
}
A variable that is first set inside a switch/case should be delcared immediately before the switch.
Rule: Use named initializers where possible and appropriate.
✓
struct Point point1 = {.x = 1, .y = 2};
✗
struct Point point1;
point1.x = 1;
point1.y = 2;
✗
struct Point point1 = {1, 2};
Reason: Struct initializers are clearer to read. They also prevent mistakes where programmers sometimes forget to initialize one field, or perhaps edit the code later and inadvertently delete part of the initialization. Bad #2 is actually fine in earlier versions of c (e.g. c89), but the new syntax shown as Good prevents errors where you get the order wrong.
Node* node = malloc(sizeof(*node));
node -> value = 10;
node -> next = NULL;
Reason: Initializing an object is essentially a single action. Keeping it in a single statement reduces the chance that you forget one part of it, or that you move part of it but not the rest.
Rule: Do not use 1- and 2-character names, except as allowed below if it is clear.
Allowed: i, j, k as counters for a 0-based for loop with a short body (≤10 lines).
Allowed: x, y, z as coordinates in an image or physical space.
Allowed: x1, x2, y1, y2, z1, z2 as bounds of a rectangle in an image or physical space.
Allowed: w, h, d as dimensions of an image or physical object/space; width, height, and depth, respectively.
Allowed: r, g, b, a as color components red, greed, blue, and alpha, respectively.
Allowed: n as the primary operand to a generic calculation (e.g., sqrt(…) in small scopes (≤10 lines).
Allowed: n as a number of something in small scopes (≤10 lines).
Allowed: ch or c as a char in a small scope (≤10 lines) where there are no other variables of type char
Allowed: s as a char* in a small scope (≤10 lines) where there are no other variables of type char*.
Allowed: any short, meaningless variable name in a demonstration or explanation of programming language features in which the content of the variable is irrelevant
Reason: The above exceptions are widely understood. Using very short names that are not understood forces the programmer/reader to search for the definition or else assume what they think it means.
Note:
Many in industry and open source, many would find these too loose (i.e., prefer less use of single-letter variables). For the class, we are erring on the permissive side to avoid being too draconian.
Rule: Loop index variables should be named “▒▒▒_idx” for 0-based for loops, or “▒▒▒_num” for 1-based loops, where ▒▒▒ is a singular noun. In small scopes (≤10 lines), “i”, “j”, or “k” are also acceptable for 0-based for loops.
Rule: Variables and struct members that represent a dimension, distance, other measurable quantity should be named with a singular noun or noun phrase their meaning and/or units (if applicable).
Rule: Variables and struct members that represent the number of items in an array, string, or list may be named according to the rules for quantities (above) −OR− “▒▒▒_size”, “▒▒▒_length”, or “▒▒▒_len” (where ▒▒▒ is the name of the array/string/list variable).
Rule: Variables and struct members of type char* that represent a string should be named with a singular noun that describes the contents or role of the string (e.g., person_name) or “▒▒▒_str”. In very small scopes (≤10 lines), s is allowed.
Rule: Variables and struct members that contain a memory address must begin with “a_” (or “a” if you are using lower-camelcase), with a few exceptions described below.
Exception: Do not use the “a_” prefix for strings, list nodes, and other objects primarily referred to by their address. For those types, add the “a_” prefix only when passing by address (e.g., char**, Node**, etc.).
Note:
Outside of this class, you will see “p” or “ptr”. For pedagogical reasons, we use the word “address”.
Rule: Variable name should communicate the type of the variable, and it's role in the computation/algorithm.
Note:
Anybody else in this class (i.e., student, staff, or instructor) should know (or have a good guess) the type of the variable, and how its value contributes to the code it is part of—without hunting for the declaration and reading everywhere the variable is used.
Note:
Variable should never mislead the person reading your code.
Rule: Begin with an underscore and declare as static.
✓
static void _do_something_mysterious() {
}
✓
static const int _MAX_BUFFER_SIZE = 128;
Reason: The underscore warns readers that this code may be hard to understand. It also lets them know that it is safe to change the function without risk of breaking any code outside this file. The static keyword causes the compiler to make the function visible only to code in this file.
Note:
Here's why this matters. If you declare a function as static but do not call it anywhere in the current file, gcc will complain. Usually, this is a good thing, since it can alert you to the presence of dead code. For example, if you have a helper function (e.g., _print_as_currency(int num_cents)) and then make another version of it (e.g., _print_as_currency_v2(…)), in the course of debugging or changing, but don't comment out the old version, you end up with two functions that both appear to do the same thing. When you come back to the code later, you might waste time looking at the old one. If it is declared as static, then gcc will tell you that the old function is defined (as static) but never used. That reminds you to comment out (or remove) the old version (i.e., _print_as_currency(…)) so you or someone else doesn't waste time trying to figure out why its code doesn't match the program's behavior. (Before this rule was added, the instructor often wasted time in office hours trying to help students debug, but looking at an old version of the function that was misbehaving.)
Note:
"Internal" means things that should not be called from outside the current file or context. This rule is more important for larger projects, and is pointless for projects that have only a single file. However, real projects tend to grow, so it is a good habit to build on.
Exception: In your test code (or any .c file containing a main(…) function, prefixing with underscore and/or declaring as static is optional.
✓
int test_simple() {
mu_start();
…
mu_end();
}
int main(int argc, char* argv[]) {
…
}
✓
static int _test_simple() {
mu_start();
…
mu_end();
}
int main(int argc, char* argv[]) {
…
}
Reason: Omitting the static function declaration is allowed for test functions (e.g., test_simple()) because sometimes, you want to temporarily comment out other test functions (e.g., test_hard()) to focus on just one test (e.g., test_simple()).
Reason: Omitting the '_' prefix is allowed for test functions because they are not like typical internal helper functions. The 'test_' prefix already tells the reader what they need to know.
Rule: Begin with an underscore and declare as static const.
✓
static const int _MAX_BUFFER_SIZE = 128;
Reason: This warns readers of your code (including you) that the meaning may be hard to understand. It also lets them know that it may change, so it's best not to refer to it from elsewhere.
Note:
"Internal" means things that should not be called from outside the current file or context. This rule is more important for larger projects, and is pointless for projects that have only a single file. However, real projects tend to grow, so it is a good habit to build on.
Rule: Variables and struct members representing a linked list node (e.g., Node) or address thereof (e.g., Node*) should be should be named “node”, “▒▒▒_node”, “next”, “prev”, “curr”, “head”, or “tail”. The head may also be called “▒▒▒_list”.
Rule: Variables and struct members representing/returning such a tree node (e.g., TreeNode) or address of a tree node (e.g., TreeNode*) should be should be named “node”, “▒▒▒_node”, “root”, “curr”, “head”, “right”, “left”, or “leaf”.
Rule: Do not use the casting operator in any context—including malloc(…)—unless absolutely necessary and safe for reasons that you completely understand well. Do not add a casting operator "just in case," or to silence warnings that you don't understand.
✗
char* s = (char*)malloc(10 * sizeof(*s));
✓
char* s = malloc(10 * sizeof(*s));
✗
char* s = (char*)va_arg(args, char*);
✓
char* s = va_arg(args, char*);
va_arg(…, char*) returns a char*, and thus can be assigned to a char* with no need for a cast.
✓
double num_children_per_family = 1.93;
printf("%d whole children per family.", (int)num_children_per_family);
Necessary because printf(…) requires that the parameter for a %d be an int.
Safe because this value will naturally be small enough to fit in an int
The (int) cast truncates the fractional part (i.e., rounds down).
Rule: Always required. (Do not use inside each case of a switch, but use for the switch itself.)
Reason: Even though C will let you omit the braces if the body is just one line, that is bug-prone. If you later add one more line to the body of your for loop, but don't add curly braces, your code may behave differently from how it appears.
✓
if(a < 1) {
foo();
}
✗
if(a < 1)
foo();
Note:
Apple had a major security bug due to someone omitting the braces (article).