This is a printer-friendly version. It omits exercises, optional topics (i.e., four-star topics), and other extra content such as learning outcomes.
The following description of the Joys of the Programming Craft was taken from Chapter 1 of the famous book The Mythical Man-Month, by Frederick P. Brooks.
Why is programming fun? What delights may its practitioner expect as his reward?
First is the sheer joy of making things. As the child delights in his mud pie, so the adult enjoys building things, especially things of his own design. I think this delight must be an image of God's delight in making things, a delight shown in the distinctness and newness of each leaf and each snowflake.
Second is the pleasure of making things that are useful to other people. Deep within, you want others to use your work and to find it helpful. In this respect the programming system is not essentially different from the child's first clay pencil holder "for Daddy's office."
Third is the fascination of fashioning complex puzzle-like objects of interlocking moving parts and watching them work in subtle cycles, playing out the consequences of principles built in from the beginning. The programmed computer has all the fascination of the pinball machine or the jukebox mechanism, carried to the ultimate.
Fourth is the joy of always learning, which springs from the nonrepeating nature of the task. In one way or another the problem is ever new, and its solver learns something: sometimes practical, sometimes theoretical, and sometimes both.
Finally, there is the delight of working in such a tractable medium. The programmer, like the poet, works only slightly removed from pure thought-stuff. He builds his castles in the air, from air, creating by the exertion of the imagination. Few media of creation are so flexible, so easy to polish and rework, so readily capable of realizing grand conceptual structures....
Yet the program construct, unlike the poet's words, is real in the sense that it moves and works, producing visible outputs separate from the construct itself. It prints results, draws pictures, produces sounds, moves arms. The magic of myth and legend has come true in our time. One types the correct incantation on a keyboard, and a display screen comes to life, showing things that never were nor could be.
Programming then is fun because it gratifies creative longings built deep within us and delights sensibilities you have in common with all men.
Not all is delight, however, and knowing the inherent woes makes it easier to bear them when they appear.
First, one must perform perfectly. The computer resembles the magic of legend in this respect, too. If one character, one pause, of the incantation is not strictly in proper form, the magic doesn't work. Human beings are not accustomed to being perfect, and few areas of human activity demand it. Adjusting to the requirement for perfection is, I think, the most difficult part of learning to program.
Next, other people set one's objectives, provide one's resources, and furnish one's information. One rarely controls the circumstances of his work, or even its goal. In management terms, one's authority is not sufficient for his responsibility. It seems that in all fields, however, the jobs where things get done never have formal authority commensurate with responsibility. In practice, actual (as opposed to formal) authority is acquired from the very momentum of accomplishment.
The dependence upon others has a particular case that is especially painful for the system programmer. He depends upon other people's programs. These are often maldesigned, poorly implemented, incompletely delivered (no source code or test cases), and poorly documented. So he must spend hours studying and fixing things that in an ideal world would be complete, available, and usable.
The next woe is that designing grand concepts is fun; finding nitty little bugs is just work. With any creative activity come dreary hours of tedious, painstaking labor, and programming is no exception.
Next, one finds that debugging has a linear convergence, or worse, where one somehow expects a quadratic sort of approach to the end. So testing drags on and on, the last difficult bugs taking more time to find than the first.
The last woe, and sometimes the last straw, is that the product over which one has labored so long appears to be obsolete upon (or before) completion. Already colleagues and competitors are in hot pursuit of new and better ideas. Already the displacement of one's thought-child is not only conceived, but scheduled.
This always seems worse than it really is. The new and better product is generally not available when one completes his own; it is only talked about. It, too, will require months of development. The real tiger is never a match for the paper one, unless actual use is wanted. Then the virtues of reality have a satisfaction all their own.
Of course the technological base on which one builds is always advancing. As soon as one freezes a design, it becomes obsolete in terms of its concepts. But implementation of real products demands phasing and quantizing. The obsolescence of an implementation must be measured against other existing implementations, not against unrealized concepts. The challenge and the mission are to find real solutions to real problems on actual schedules with available resources.
This then is programming, both a tar pit in which many efforts have floundered and a creative activity with joys and woes all its own. For many, the joys far outweigh the woes....
A software requirement specifies a need to be fulfilled by the software product.
A software project may be,
In either case, requirements need to be gathered, analyzed, specified, and managed.
Requirements come from stakeholders.
Stakeholder: A party that is potentially affected by the software project. e.g. users, sponsors, developers, interest groups, government agencies, etc.
Identifying requirements is often not easy. For example, stakeholders may not be aware of their precise needs, may not know how to communicate their requirements correctly, may not be willing to spend effort in identifying requirements, etc.
Requirements can be divided into two in the following way:
Some examples of non-functional requirement categories:
You may have to spend an extra effort in digging NFRs out as early as possible because,
Here are some characteristics of well-defined requirements [📖 zielczynski]:
Besides these criteria for individual requirements, the set of requirements as a whole should be
Requirements can be prioritized based on the importance and urgency, while keeping in mind the constraints of schedule, budget, staff resources, quality goals, and other constraints.
A common approach is to group requirements into priority categories. Note that all such scales are subjective, and stakeholders define the meaning of each level in the scale for the project at hand.
An example scheme for categorizing requirements:
Essential
: The product must have this requirement fulfilled or else it does not get user acceptance.Typical
: Most similar systems have this feature although the product can survive without it.Novel
: New features that could differentiate this product from the rest.Other schemes:
High
, Medium
, Low
Must-have
, Nice-to-have
, Unlikely-to-have
Level 0
, Level 1
, Level 2
, ...Some requirements can be discarded if they are considered ‘out of ’.
The requirement given below is for a Calendar application. Stakeholders of the software (e.g. product designers) might decide the following requirement is not in the scope of the software.
The software records the actual time taken by each task and show the difference between the actual and scheduled time for the task.
Brainstorming: A group activity designed to generate a large number of diverse and creative ideas for the solution of a problem.
In a brainstorming session there are no "bad" ideas. The aim is to generate ideas; not to validate them. Brainstorming encourages you to "think outside the box" and put "crazy" ideas on the table without fear of rejection.
Surveys can be used to solicit responses and opinions from a large number of stakeholders regarding a current product or a new product.
Observing users in their natural work environment can uncover product requirements. Usage data of an existing system can also be used to gather information about how an existing system is being used, which can help in building a better replacement e.g. to find the situations where the user makes mistakes when using the current system.
Interviewing stakeholders and domain experts can produce useful information about project requirements.
[source]
Focus groups are a kind of informal interview within an interactive group setting. A group of people (e.g. potential users, beta testers) are asked about their understanding of a specific issue, process, product, advertisement, etc.
Prototype: A prototype is a mock up, a scaled down version, or a partial system constructed
Prototyping can uncover requirements, in particular, those related to how users interact with the system. UI prototypes or mock ups are often used in brainstorming sessions, or in meetings with the users to get quick feedback from them.
A mock up (also called a wireframe diagram) of a dialog box:
[source: plantuml.com]
Prototyping can be used for discovering as well as specifying requirements e.g. a UI prototype can serve as a specification of what to build.
Studying existing products can unearth shortcomings of existing solutions that can be addressed by a new product. Product manuals and other forms of documentation of an existing system can tell us how the existing solutions work.
When developing a game for a mobile device, a look at a similar PC game can give insight into the kind of features and interactions the mobile game can offer.
A textual description (i.e. prose) can be used to describe requirements. Prose is especially useful when describing abstract ideas such as the vision of a product.
The product vision of the TEAMMATES Project given below is described using prose.
TEAMMATES aims to become the biggest student project in the world (biggest here refers to 'many contributors, many users, large code base, evolving over a long period'). Furthermore, it aims to serve as a training tool for Software Engineering students who want to learn SE skills in the context of a non-trivial real software product.
Avoid using lengthy prose to describe requirements; they can be hard to follow.
Feature list: A list of features of a product grouped according to some criteria such as aspect, priority, order of delivery, etc.
A sample feature list from a simple Minesweeper game (only a brief description has been provided to save space):
User story: User stories are short, simple descriptions of a feature told from the perspective of the person who desires the new capability, usually a user or customer of the system. [Mike Cohn]
A common format for writing user stories is:
User story format: As a {user type/role} I can {function} so that {benefit}
Examples (from a Learning Management System):
You can write user stories on index cards or sticky notes, and arrange them on walls or tables, to facilitate planning and discussion. Alternatively, you can use a software (e.g., GitHub Project Boards, Trello, Google Docs, ...) to manage user stories digitally.
The {benefit}
can be omitted if it is obvious.
As a user, I can login to the system so that I can access my data
It is recommended to confirm there is a concrete benefit even if you omit it from the user story. If not, you could end up adding features that have no real benefit.
You can add more characteristics to the {user role}
to provide more context to the user story.
You can write user stories at various levels. High-level user stories, called epics (or themes) cover bigger functionality. You can then break down these epics to multiple user stories of normal size.
[Epic] As a lecturer, I can monitor student participation levels
You can add conditions of satisfaction to a user story to specify things that need to be true for the user story implementation to be accepted as ‘done’.
As a lecturer, I can view the forum post count of each student so that I can identify the activity level of students in the forum.
Conditions:
Separate post count for each forum should be shown
Total post count of a student should be shown
The list should be sortable by student name and post count
Other useful info that can be added to a user story includes (but not limited to)
User stories capture user requirements in a way that is convenient for , , and .
[User stories] strongly shift the focus from writing about features to discussing them. In fact, these discussions are more important than whatever text is written. [Mike Cohn, MountainGoat Software 🔗]
User stories differ from mainly in the level of detail. User stories should only provide enough details to make a reasonably low risk estimate of how long the user story will take to implement. When the time comes to implement the user story, the developers will meet with the customer face-to-face to work out a more detailed description of the requirements. [more...]
User stories can capture non-functional requirements too because even NFRs must benefit some stakeholder.
An example of an NFR captured as a user story:
As a | I want to | so that |
---|---|---|
impatient user | to be able experience reasonable response time from the website while up to 1000 concurrent users are using it | I can use the app even when the traffic is at the maximum expected level |
Given their lightweight nature, user stories are quite handy for recording requirements during early stages of requirements gathering.
Given below is a possible recipe you can take when using user stories for early stages of requirement gathering.
Step 0: Clear your mind of preconceived product ideas
Even if you already have some idea of what your product will look/behave like in the end, clear your mind of those ideas. The product is the solution. At this point, we are still at the stage of figuring out the problem (i.e., user requirements). Let's try to get from the problem to the solution in a systematic way, one step at a time.
Step 1: Define the target user as a persona:
Decide your target user's profile (e.g. a student, office worker, programmer, salesperson) and work patterns (e.g. Does he work in groups or alone? Does he share his computer with others?). A clear understanding of the target user will help when deciding the importance of a user story. You can even narrow it down to a persona. Here is an example:
Jean is a university student studying in a non-IT field. She interacts with a lot of people due to her involvement in university clubs/societies. ...
Step 2: Define the problem scope:
Decide the exact problem you are going to solve for the target user. It is also useful to specify what related problems it will not solve so that the exact scope is clear.
ProductX helps Jean keep track of all her school contacts. It does not cover communicating with contacts.
Step 3: List scenarios to form a narrative:
Think of the various scenarios your target user is likely to go through as she uses your app. Following a chronological sequence as if you are telling a story might be helpful.
A. First use:
- Jean gets to know about ProductX. She downloads it and launches it to check out what it can do.
- After playing around with the product for a bit, Jean wants to start using it for real.
- ...
B. Second use: (Jean is still a beginner)
- Jean launches ProductX. She wants to find ...
- ...
C. 10th use: (Jean is a little bit familiar with the app)
- ...
D. 100th use: (Jean is an expert user)
- Jean launches the app and does ... and ... followed by ... as per her usual habit.
- Jean feels some of the data in the app are no longer needed. She wants to get rid of them to reduce clutter.
More examples that might apply to some products:
- Jean uses the app at the start of the day to ...
- Jean uses the app before going to sleep to ...
- Jean hasn't used the app for a while because she was on a three-month training programme. She is now back at work and wants to resume her daily use of the app.
- Jean moves to another company. Some of her clients come with her but some don't.
- Jean starts freelancing in her spare time. She wants to keep her freelancing clients separate from her other clients.
Step 4: List the user stories to support the scenarios:
Based on the scenarios, decide on the user stories you need to support. For example, based on the scenario 'A. First use', you might have user stories such as these:
- As a potential user exploring the app, I can see the app populated with sample data, so that I can easily see how the app will look like when it is in use.
- As a user ready to start using the app, I can purge all current data, so that I can get rid of sample/experimental data I used for exploring the app.
To give another example, based on the scenario 'D. 100th use', you might have user stories such as these:
- As an expert user, I can create shortcuts for tasks, so that I can save time on frequently performed tasks.
- As a long-time user, I can archive/hide unused data, so that I am not distracted by irrelevant data.
Do not 'evaluate' the value of user stories while brainstorming. Reason: an important aspect of brainstorming is not judging the ideas generated.
Other tips:
As a user, I want to see a list of tasks that need my attention most at the present time, so that I pay attention to them first.
While use cases can be recorded on in the initial stages, an online tool is more suitable for longer-term management of user stories, especially if the team is not .
Use case: A description of a set of sequences of actions, including variants, that a system performs to yield an observable result of value to an actor [ 📖 : ].
A use case describes an interaction between the user and the system for a specific functionality of the system.
UML includes a diagram type called use case diagrams that can illustrate use cases of a system visually, providing a visual ‘table of contents’ of the use cases of a system.
In the example on the right, note how use cases are shown as ovals and user roles relevant to each use case are shown as stick figures connected to the corresponding ovals.
Use cases capture the functional requirements of a system.
A use case is an interaction between a system and its actors.
Actors in Use Cases
Actor: An actor (in a use case) is a role played by a user. An actor can be a human or another system. Actors are not part of the system; they reside outside the system.
Some example actors for a Learning Management System:
A use case can involve multiple actors.
An actor can be involved in many use cases.
A single person/system can play many roles.
Many persons/systems can play a single role.
Use cases can be specified at various levels of detail.
Consider the three use cases given below. Clearly, (a) is at a higher level than (b) and (b) is at a higher level than (c).
While modeling user-system interactions,
Writing use case steps
The main body of the use case is a sequence of steps that describes the interaction between the system and the actors. Each step is given as a simple statement describing who does what.
An example of the main body of a use case.
A use case describes only the externally visible behavior, not internal details, of a system i.e. should minimize details that are not part of the interaction between the user and the system.
This example use case step refers to behaviors not externally visible.
A step gives the intention of the actor (not the mechanics). That means UI details are usually omitted. The idea is to leave as much flexibility to the UI designer as possible. That is, the use case specification should be as general as possible (less specific) about the UI.
The first example below is not a good use case step because it contains UI-specific details. The second one is better because it omits UI-specific details.
Bad : User right-clicks the text box and chooses ‘clear’
Good : User clears the input
A use case description can show loops too.
An example of how you can show a loop:
Software System: SquareGame
Use case: - Play a Game
Actors: Player (multiple players)
MSS:
Use case ends.
The Main Success Scenario (MSS) describes the most straightforward interaction for a given use case, which assumes that nothing goes wrong. This is also called the Basic Course of Action or the Main Flow of Events of a use case.
Note how the MSS in the example below assumes that all entered details are correct and ignores problems such as timeouts, network outages etc. For example, the MSS does not tell us what happens if the user enters incorrect data.
System: Online Banking System (OBS)
Use case: UC23 - Transfer Money
Actor: User
MSS:
Use case ends.
Extensions are "add-on"s to the MSS that describe exceptional/alternative flow of events. They describe variations of the scenario that can happen if certain things are not as expected by the MSS. Extensions appear below the MSS.
This example adds some extensions to the use case in the previous example.
System: Online Banking System (OBS) Use case: UC23 - Transfer Money Actor: User MSS: 1. User chooses to transfer money. 2. OBS requests for details of the transfer. 3. User enters the requested details. 4. OBS requests for confirmation. 5. User confirms. 6. OBS transfers the money and displays the new account balance. Use case ends.
Extensions: 3a. OBS detects an error in the entered data. 3a1. OBS requests for the correct data. 3a2. User enters new data. Steps 3a1-3a2 are repeated until the data entered are correct. Use case resumes from step 4. 3b. User requests to effect the transfer in a future date. 3b1. OBS requests for confirmation. 3b2. User confirms future transfer. Use case ends. *a. At any time, User chooses to cancel the transfer. *a1. OBS requests to confirm the cancellation. *a2. User confirms the cancellation. Use case ends. *b. At any time, 120 seconds lapse without any input from the User. *b1. OBS cancels the transfer. *b2. OBS informs the User of the cancellation. Use case ends.
Note that the numbering style is not a universal rule but a widely used convention. Based on that convention,
3a.
and 3b.
can happen just after step 3
of the MSS.*a.
can happen at any step (hence, the *
).When separating extensions from the MSS, keep in mind that the MSS should be self-contained. That is, the MSS should give us a complete usage scenario.
Also note that it is not useful to mention events such as power failures or system crashes as extensions because the system cannot function beyond such catastrophic failures.
In use case diagrams you can use the <<extend>>
arrows to show extensions. Note the direction of the arrow is from the extension to the use case it extends and the arrow uses a dashed line.
A use case can include another use case. Underlined text is used to show an inclusion of a use case.
This use case includes two other use cases, one in step 1 and one in step 2.
Inclusions are useful,
You use a dotted arrow and an <<include>>
annotation to show use case inclusions in a use case diagram. Note how the arrow direction is different from the <<extend>>
arrows.
Preconditions specify the specific state you expect the system to be in before the use case starts.
Software System: Online Banking System
Use case: UC23 - Transfer Money
Actor: User
Preconditions: User is logged in
MSS:
Guarantees specify what the use case promises to give us at the end of its operation.
Software System: Online Banking System
Use case: UC23 - Transfer Money
Actor: User
Preconditions: User is logged in.
Guarantees:
MSS:
You can use actor generalization in use case diagrams using a symbol similar to that of UML notation for inheritance.
In this example, actor Blogger
can do all the use cases the actor Guest
can do, as a result of the actor generalization relationship given in the diagram.
Do not over-complicate use case diagrams by trying to include everything possible. A use case diagram is a brief summary of the use cases that is used as a starting point. Details of the use cases can be given in the use case descriptions.
Some include ‘System’ as an actor to indicate that something is done by the system itself without being initiated by a user or an external system.
The diagram below can be used to indicate that the system generates daily reports at midnight.
However, others argue that only use cases providing value to an external user/system should be shown in the use case diagram. For example, they argue that view daily report
should be the use case and generate daily report
is not to be shown in the use case diagram because it is simply something the system has to do to support the view daily report
use case.
You are recommended to follow the latter view (i.e. not to use System as a user). Limit use cases for modeling behaviors that involve an external actor.
UML is not very specific about the text contents of a use case. Hence, there are many styles for writing use cases. For example, the steps can be written as a continuous paragraph.
Use cases should be easy to read. Note that there is no strict rule about writing all details of all steps or a need to use all the elements of a use case.
There are some advantages of documenting system requirements as use cases:
One of the main disadvantages of use cases is that they are not good for capturing requirements that do not involve a user interacting with the system. Hence, they should not be used as the sole means to specify requirements.
Glossary: A glossary serves to ensure that all stakeholders have a common understanding of the noteworthy terms, abbreviations, acronyms etc.
Here is a partial glossary from a variant of the Snakes and Ladders game:
Design is the creative process of transforming the problem into a solution; the solution is also called design. -- 📖 Software Engineering Theory and Practice, Shari Lawrence; Atlee, Joanne M. Pfleeger
Software design has two main aspects:
Abstraction is a technique for dealing with complexity. It works by establishing a level of complexity we are interested in, and suppressing the more complex details below that level.
The guiding principle of abstraction is that only details that are relevant to the current perspective or the task at hand need to be considered. As most programs are written to solve complex problems involving large amounts of intricate details, it is impossible to deal with all these details at the same time. That is where abstraction can help.
Data abstraction: abstracting away the lower level data items and thinking in terms of bigger entities
Within a certain software component, you might deal with a user data type, while ignoring the details contained in the user data item such as name, and date of birth. These details have been ‘abstracted away’ as they do not affect the task of that software component.
Control abstraction: abstracting away details of the actual control flow to focus on tasks at a higher level
print(“Hello”)
is an abstraction of the actual output mechanism within the computer.
Abstraction can be applied repeatedly to obtain progressively higher levels of abstraction.
An example of different levels of data abstraction: a File
is a data item that is at a higher level than an array and an array is at a higher level than a bit.
An example of different levels of control abstraction: execute(Game)
is at a higher level than print(Char)
which is at a higher level than an Assembly language instruction MOV
.
Abstraction is a general concept that is not limited to just data or control abstractions.
Some more general examples of abstraction:
A model is a representation of something else.
A class diagram is a model that represents a software design.
A model provides a simpler view of a complex entity because a model captures only a selected aspect. This omission of some aspects implies models are abstractions.
A class diagram captures the structure of the software design but not the behavior.
Multiple models of the same entity may be needed to capture it fully.
In addition to a class diagram (or even multiple class diagrams), a number of other diagrams may be needed to capture various interesting aspects of the software.
In software development, models are useful in several ways:
a) To analyze a complex entity related to software development.
Some examples of using models for analysis:
b) To communicate information among stakeholders. Models can be used as a visual aid in discussions and documentation.
Some examples of using models to communicate:
c) As a blueprint for creating software. Models can be used as instructions for building software.
Some examples of using models as blueprints:
The following diagram uses the class diagram notation to show the different types of UML diagrams.
source:https://en.wikipedia.org/
Object-Oriented Programming (OOP) is a programming paradigm. A programming paradigm guides programmers to analyze programming problems, and structure programming solutions, in a specific way.
Programming languages have traditionally divided the world into two parts—data and operations on data. Data is static and immutable, except as the operations may change it. The procedures and functions that operate on data have no lasting state of their own; they’re useful only in their ability to affect data.
This division is, of course, grounded in the way computers work, so it’s not one that you can easily ignore or push aside. Like the equally pervasive distinctions between matter and energy and between nouns and verbs, it forms the background against which you work. At some point, all programmers—even object-oriented programmers—must lay out the data structures that their programs will use and define the functions that will act on the data.
With a procedural programming language like C, that’s about all there is to it. The language may offer various kinds of support for organizing data and functions, but it won’t divide the world any differently. Functions and data structures are the basic elements of design.
Object-oriented programming doesn’t so much dispute this view of the world as restructure it at a higher level. It groups operations and data into modular units called objects and lets you combine objects into structured networks to form a complete program. In an object-oriented programming language, objects and object interactions are the basic elements of design.
Some other examples of programming paradigms are:
Paradigm | Programming Languages |
---|---|
Procedural Programming paradigm | C |
Functional Programming paradigm | F#, Haskell, Scala |
Logic Programming paradigm | Prolog |
Some programming languages support multiple paradigms.
Java is primarily an OOP language but it supports limited forms of functional programming and it can be used to (although not recommended) write procedural code. e.g. se-edu/addressbook-level1
JavaScript and Python support functional, procedural, and OOP programming.
Every object has both state (data) and behavior (operations on data). In that, they’re not much different from ordinary physical objects. It’s easy to see how a mechanical device, such as a pocket watch or a piano, embodies both state and behavior. But almost anything that’s designed to do a job does, too. Even simple things with no moving parts such as an ordinary bottle combine state (how full the bottle is, whether or not it’s open, how warm its contents are) with behavior (the ability to dispense its contents at various flow rates, to be opened or closed, to withstand high or low temperatures).
It’s this resemblance to real things that gives objects much of their power and appeal. They can not only model components of real systems, but equally as well fulfill assigned roles as components in software systems.
Object Oriented Programming (OOP) views the world as a network of interacting objects.
A real world scenario viewed as a network of interacting objects:
You are asked to find out the average age of a group of people Adam, Beth, Charlie, and Daisy. You take a piece of paper and pen, go to each person, ask for their age, and note it down. After collecting the age of all four, you enter it into a calculator to find the total. And then, use the same calculator to divide the total by four, to get the average age. This can be viewed as the objects You
, Pen
, Paper
, Calculator
, Adam
, Beth
, Charlie
, and Daisy
interacting to accomplish the end result of calculating the average age of the four persons. These objects can be considered as connected in a certain network of certain structure.
OOP solutions try to create a similar object network inside the computer’s memory – a sort of virtual simulation of the corresponding real world scenario – so that a similar result can be achieved programmatically.
OOP does not demand that the virtual world object network follow the real world exactly.
Our previous example can be tweaked a bit as follows:
Main
to represent your role in the scenario.Pen
and Paper
with an object called AgeList
that is able to keep a list of ages.Every object has both state (data) and behavior (operations on data).
Object | Real World? | Virtual World? | Example of State (i.e. Data) | Examples of Behavior (i.e. Operations) |
---|---|---|---|---|
Adam | Name, Date of Birth | Calculate age based on birthday | ||
Pen | - | Ink color, Amount of ink remaining | Write | |
AgeList | - | Recorded ages | Give the number of entries, Accept an entry to record | |
Calculator | Numbers already entered | Calculate the sum, divide | ||
You/Main | Average age, Sum of ages | Use other objects to calculate |
Every object has an interface and an implementation.
Every real world object has:
The interface and implementation of some real-world objects in our example:
Similarly, every object in the virtual world has an interface and an implementation.
The interface and implementation of some virtual-world objects in our example:
Adam
: the interface might have a method getAge(Date asAt)
; the implementation of that method is not visible to other objects.Objects interact by sending messages. Both real world and virtual world object interactions can be viewed as objects sending messages to each other. The message can result in the sender object receiving a response and/or the receiver object’s state being changed. Furthermore, the result can vary based on which object received the message, even if the message is identical (see rows 1 and 2 in the example below).
Examples:
World | Sender | Receiver | Message | Response | State Change |
---|---|---|---|---|---|
Real | You | Adam | "What is your name?" | "Adam" | - |
Real | as above | Beth | as above | "Beth" | - |
Real | You | Pen | Put nib on paper and apply pressure | Makes a mark on your paper | Ink level goes down |
Virtual | Main | Calculator (current total is 50) | add(int i): int i = 23 | 73 | total = total + 23 |
The concept of Objects in OOP is an abstraction mechanism because it allows us to abstract away the lower level details and work with bigger granularity entities i.e. ignore details of data formats and the method implementation details and work at the level of objects.
You can deal with a Person
object that represents the person Adam and query the object for Adam's age instead of dealing with details such as Adam’s date of birth (DoB), in what format the DoB is stored, the algorithm used to calculate the age from the DoB, etc.
Encapsulation protects an implementation from unintended actions and from inadvertent access.
-- Object-Oriented Programming with Objective-C, Apple
An object is an encapsulation of some data and related behavior in terms of two aspects:
1. The packaging aspect: An object packages data and related behavior together into one self-contained unit.
2. The information hiding aspect: The data in an object is hidden from the outside world and are only accessible using the object's interface.
Writing an OOP program is essentially writing instructions that the computer will use to,
A class contains instructions for creating a specific kind of objects. It turns out sometimes multiple objects keep the same type of data and have the same behavior because they are of the same kind. Instructions for creating a 'kind' (or ‘class’) of objects can be done once and those same instructions can be used to objects of that kind. You call such instructions a Class.
Classes and objects in an example scenario
Consider the example of writing an OOP program to calculate the average age of Adam, Beth, Charlie, and Daisy.
Instructions for creating objects Adam
, Beth
, Charlie
, and Daisy
will be very similar because they are all of the same kind: they all represent ‘persons’ with the same interface, the same kind of data (i.e. name
, dateOfBirth
, etc.), and the same kind of behavior (i.e. getAge(Date)
, getName()
, etc.). Therefore, you can have a class called Person
containing instructions on how to create Person
objects and use that class to instantiate objects Adam
, Beth
, Charlie
, and Daisy
.
Similarly, you need classes AgeList
, Calculator
, and Main
classes to instantiate one each of AgeList
, Calculator
, and Main
objects.
Class | Objects |
---|---|
Person | objects representing Adam, Beth, Charlie, Daisy |
AgeList | an object to represent the age list |
Calculator | an object to do the calculations |
Main | an object to represent you (i.e., the one who manages the whole operation) |
While all objects of a class have the same attributes, each object has its own copy of the attribute value.
All Person
objects have the name
attribute but the value of that attribute varies between Person
objects.
However, some attributes are not suitable to be maintained by individual objects. Instead, they should be maintained centrally, shared by all objects of the class. They are like ‘global variables’ but attached to a specific class. Such variables whose value is shared by all instances of a class are called class-level attributes.
The attribute totalPersons
should be maintained centrally and shared by all Person
objects rather than copied at each Person
object.
Similarly, when a normal method is being called, a message is being sent to the receiving object and the result may depend on the receiving object.
Sending the getName()
message to the Adam
object results in the response "Adam"
while sending the same message to the Beth
object results in the response "Beth"
.
However, there can be methods related to a specific class but not suitable for sending messages to a specific object of that class. Such methods that are called using the class instead of a specific instance are called class-level methods.
The method getTotalPersons()
is not suitable to send to a specific Person
object because a specific object of the Person
class should not have to know about the total number of Person
objects.
Class-level attributes and methods are collectively called class-level members (also called static members sometimes because some programming languages use the keyword static
to identify class-level members). They are to be accessed using the class name rather than an instance of the class.
Objects in an OO solution need to be connected to each other to form a network so that they can interact with each other. Such connections between objects are called associations.
Suppose an OOP program for managing a learning management system creates an object structure to represent the related objects. In that object structure you can expect to have associations between a Course
object that represents a specific course and Student
objects that represent students taking that course.
Associations in an object structure can change over time.
To continue the previous example, the associations between a Course
object and Student
objects can change as students enroll in the module or drop the module over time.
Associations among objects can be generalized as associations between the corresponding classes too.
In our example, as some Course
objects can have associations with some Student
objects, you can view it as an association between the Course
class and the Student
class.
You use instance level variables to implement associations.
When two classes are linked by an association, it does not necessarily mean the two objects taking part in an instance of the association knows about (i.e., has a reference to) each other. The concept of which object in the association knows about the other object is called navigability.
Navigability can be unidirectional or bidirectional. Suppose there is an association between the classes Box
and Rope
, and the Box
object b
and the Rope
object r
is taking part in one instance of that association.
Box
to Rope
, b
will have a reference to r
but r
will not have a reference to b
. Similarly, if the navigability is in the other direction, r
will have a reference to b
but b
will not have a reference to r
.b
will have a reference to r
and r
will have a reference to b
i.e., the two objects will be pointing to each other for the same single instance of the association.Note that two unidirectional associations in opposite directions do not add up to a single bidirectional association.
In the code below, there is a bidirectional association between the Person
class and the Cat
class i.e., if Person
p
is the owner of the Cat
c
, p
it will result in p
and c
having references to each other.
class Person {
Cat pet;
//...
}
class Cat{
Person owner;
//...
}
class Person:
def __init__(self):
self.pet = None # a Cat object
class Cat:
def __init__(self):
self.owner = None # a Person object
The code below has two unidirectional associations between the Person
class and the Cat
class (in opposite directions) because the breeder is not necessarily the same person keeping the cat as a pet i.e., there are two separate associations here, which rules out it being a bidirectional association.
class Person {
Cat pet;
//...
}
class Cat{
Person breeder;
//...
}
class Person:
def __init__(self):
self.pet = None # a Cat object
class Cat:
def __init__(self):
self.breeder = None # a Person object
Multiplicity is the aspect of an OOP solution that dictates how many objects take part in each association.
The multiplicity of the association between Course
objects and Student
objects tells you how many Course
objects can be associated with one Student
object and vice versa.
A normal instance-level variable gives us a 0..1
multiplicity (also called optional associations) because a variable can hold a reference to a single object or null
.
In the code below, the Logic
class has a variable that can hold 0..1
i.e., zero or one Minefield
objects.
class Logic {
Minefield minefield;
// ...
}
class Minefield {
//...
}
class Logic:
def __init__(self):
self.minefield = None
# ...
class Minefield:
# ...
A variable can be used to implement a 1
multiplicity too (also called compulsory associations).
In the code below, the Logic
class will always have a ConfigGenerator
object, provided the variable is not set to null
at some point.
class Logic {
ConfigGenerator cg = new ConfigGenerator();
...
}
class Logic:
def __init__(self):
self.config_gen = ConfigGenerator()
Bidirectional associations require matching variables in both classes.
In the code below, the Foo
class has a variable to hold a Bar
object and vice versa i.e., each object can have an association with an object of the other type.
class Foo {
Bar bar;
//...
}
class Bar {
Foo foo;
//...
}
class Foo:
def __init__(self, bar):
self.bar = bar;
class Bar:
def __init__(self, foo):
self.foo = foo;
To implement other multiplicities, choose a suitable data structure such as Arrays, ArrayLists, HashMaps, Sets, etc.
This code uses a two-dimensional array to implement a 1-to-many association from the Minefield
to Cell
.
class Minefield {
Cell[][] cell;
...
}
class Minefield:
def __init__(self):
self.cells = {1:[], 2:[], 3:[]}
In the context of OOP associations, a dependency is a need for one class to depend on another without having a direct association in the same direction. Reason for the exclusion: If there is an association from class Foo
to class Bar
(i.e., navigable from Foo
to Bar
), that means Foo
is obviously dependent on Bar
and hence there is no point in mentioning dependency specifically. In other words, we are only concerned about non-obvious dependencies here. One cause of such dependencies is interactions between objects that do not have a long-term link between them.
A Course
class can have a dependency on a Registrar
class because the Course
class needs to refer to the Registrar
class to obtain the the maximum number of students it can support (e.g., Registrar.MAX_COURSE_CAPACITY
).
In the code below, Foo
has a dependency on Bar
but it is not an association because it is only a interaction and there is no long term relationship between a Foo
object and a Bar
object. i.e. the Foo
object does not keep the Bar
object it receives as a parameter.
class Foo {
int calculate(Bar bar) {
return bar.getValue();
}
}
class Bar {
int value;
int getValue() {
return value;
}
}
class Foo:
def calculate(self, bar):
return bar.value;
class Bar:
def __init__(self, value):
self.value = value
A composition is an association that represents a strong whole-part relationship. When the whole is destroyed, parts are destroyed too i.e., the part should not exist without being attached to a whole.
A Board
(used for playing board games) consists of Square
objects.
Composition also implies that there cannot be cyclical links.
The ‘sub-folder’ association between Folder
objects is a composition type association. That means if the Folder
object foo
is a sub-folder of Folder
object bar
, bar
cannot be a sub-folder of foo
.
Whether a relationship is a composition can depend on the context.
Is the relationship between Email
and EmailSubject
composition? That is, is the email subject part of an email to the extent that an email subject cannot exist without an email?
A common use of composition is when parts of a big class are carved out as smaller classes for the ease of managing the internal design. In such cases, the classes extracted out still act as parts of the bigger class and the outside world has no business knowing about them.
Cascading deletion alone is not sufficient for composition. Suppose there is a design in which Person
objects are attached to Task
objects and the former get deleted whenever the latter is deleted. This fact alone does not mean there is a composition relationship between the two classes. For it to be composition, a Person
must be an integral part of a Task
in the context of that association, at the concept level (not simply at implementation level).
Identifying and keeping track of composition relationships in the design has benefits such as helping to maintain the data integrity of the system. For example, when you know that a certain relationship is a composition, you can take extra care in your implementation to ensure that when the whole object is deleted, all its parts are deleted too.
Composition is implemented using a normal variable. If correctly implemented, the ‘part’ object will be deleted when the ‘whole’ object is deleted. Ideally, the ‘part’ object may not even be visible to clients of the ‘whole’ object.
class Email {
private Subject subject;
...
}
class Email:
def __init__(self):
self.__subject = Subject()
In this code, the Email
has a composition type relationship with the Subject
class, in the sense that the subject is part of the email.
Aggregation represents a container-contained relationship. It is a weaker relationship than composition.
SportsClub
can act as a container for Person
objects who are members of the club. Person
objects can survive without a SportsClub
object.
Implementation is similar to that of composition except the containee object can exist even after the container object is deleted.
In the code below, there is an aggregation association between the Team
class and the Person
class in that a Team
contains a Person
object who is the leader of the team.
class Team {
Person leader;
...
void setLeader(Person p) {
leader = p;
}
}
class Team:
def __init__(self):
self.__leader = None
def set_leader(self, person):
self.__leader = person
The OOP concept Inheritance allows you to define a new class based on an existing class.
For example, you can use inheritance to define an EvaluationReport
class based on an existing Report
class so that the EvaluationReport
class does not have to duplicate data/behaviors that are already implemented in the Report
class. The EvaluationReport
can inherit the wordCount
attribute and the print()
method from the base class Report
.
A superclass is said to be more general than the subclass. Conversely, a subclass is said to be more specialized than the superclass.
Applying inheritance on a group of similar classes can result in the common parts among classes being extracted into more general classes.
Man
and Woman
behave the same way for certain things. However, the two classes cannot be simply replaced with a more general class Person
because of the need to distinguish between Man
and Woman
for certain other things. A solution is to add the Person
class as a superclass (to contain the code common to men and women) and let Man
and Woman
inherit from Person
class.
Inheritance implies the derived class can be considered as a sub-type of the base class (and the base class is a super-type of the derived class), resulting in an is a relationship.
Inheritance does not necessarily mean a sub-type relationship exists. However, the two often go hand-in-hand. For simplicity, at this point let us assume inheritance implies a sub-type relationship.
To continue the previous example,
Woman
is a Person
Man
is a Person
Inheritance relationships through a chain of classes can result in inheritance hierarchies (aka inheritance trees).
Two inheritance hierarchies/trees are given below. Note that the triangle points to the parent class. Observe how the Parrot
is a Bird
as well as it is an Animal
.
Multiple Inheritance is when a class inherits directly from multiple classes. Multiple inheritance among classes is allowed in some languages (e.g., Python, C++) but not in other languages (e.g., Java, C#).
The Honey
class inherits from the Food
class and the Medicine
class because honey can be consumed as a food as well as a medicine (in some oriental medicine practices). Similarly, a Car
is a Vehicle
, an Asset
and a Liability
.
Method overriding is when a sub-class changes the behavior inherited from the parent class by re-implementing the method. Overridden methods have the same name, same type signature, and same return type.
Consider the following case of EvaluationReport
class inheriting the Report
class:
Report methods | EvaluationReport methods | Overrides? |
---|---|---|
print() | print() | Yes |
write(String) | write(String) | Yes |
read():String | read(int):String | No. Reason: the two methods have different signatures; This is a case of overloading (rather than overriding). |
Method overloading is when there are multiple methods with the same name but different type signatures. Overloading is used to indicate that multiple operations do similar things but take different parameters.
Type signature: The type signature of an operation is the type sequence of the parameters. The return type and parameter names are not part of the type signature. However, the parameter order is significant.
Example:
Method | Type Signature |
---|---|
int add(int X, int Y) | (int, int) |
void add(int A, int B) | (int, int) |
void m(int X, double Y) | (int, double) |
void m(double X, int Y) | (double, int) |
In the case below, the calculate
method is overloaded because the two methods have the same name but different type signatures (String)
and (int)
.
calculate(String): void
calculate(int): void
Well-written applications include error-handling code that allows them to recover gracefully from unexpected errors. When an error occurs, the application may need to request user intervention, or it may be able to recover on its own. In extreme cases, the application may log the user off or shut down the system. -- Microsoft
Exceptions are used to deal with 'unusual' but not entirely unexpected situations that the program might encounter at runtime.
Exception:
The term exception is shorthand for the phrase "exceptional event." An exception is an event, which occurs during the execution of a program, that disrupts the normal flow of the program's instructions. –- Java Tutorial (Oracle Inc.)
Examples:
Most languages allow code that encountered an "exceptional" situation to encapsulate details of the situation in an Exception object and throw/raise that object so that another piece of code can catch it and deal with it. This is especially useful when the code that encountered the unusual situation does not know how to deal with it.
The extract below from the -- Java Tutorial (with slight adaptations) explains how exceptions are typically handled.
When an error occurs at some point in the execution, the code being executed creates an exception object and hands it off to the runtime system. The exception object contains information about the error, including its type and the state of the program when the error occurred. Creating an exception object and handing it to the runtime system is called throwing an exception.
After a method throws an exception, the runtime system attempts to find something to handle it in the . The runtime system searches the call stack for a method that contains a block of code that can handle the exception. This block of code is called an exception handler. The search begins with the method in which the error occurred and proceeds through the call stack in the reverse order in which the methods were called. When an appropriate handler is found, the runtime system passes the exception to the handler. An exception handler is considered appropriate if the type of the exception object thrown matches the type that can be handled by the handler.
The exception handler chosen is said to catch the exception. If the runtime system exhaustively searches all the methods on the call stack without finding an appropriate exception handler, the program terminates.
Advantages of exception handling in this way:
Professional software engineers often write code using Integrated Development Environments (IDEs). IDEs support most development-related work within the same tool (hence, the term integrated).
An IDE generally consists of:
Examples of popular IDEs:
Some web-based IDEs have appeared in recent times too e.g., Amazon's Cloud9 IDE.
Some experienced developers, in particular those with a UNIX background, prefer lightweight yet powerful text editors with scripting capabilities (e.g. Emacs) over heavier IDEs.
Debugging is the process of discovering defects in the program. Here are some approaches to debugging:
Exiting process() method, x is 5.347
. This approach is not recommended due to these reasons:
In terms of timing and frequency, there are two general approaches to integration: late and one-time, early and frequent.
Late and one-time: wait till all components are completed and integrate all finished components near the end of the project.
This approach is not recommended because integration often causes many component incompatibilities (due to previous miscommunications and misunderstandings) to surface which can lead to delivery delays i.e. Late integration → incompatibilities found → major rework required → cannot meet the delivery date.
Early and frequent: integrate early and evolve each part in parallel, in small steps, re-integrating frequently.
A can be written first. This can be done by one developer, possibly the one in charge of integration. After that, all developers can flesh out the skeleton in parallel, adding one feature at a time. After each feature is done, simply integrate the new code into the main system.
Here is an animation that compares the two approaches:
Big-bang integration: integrate all components at the same time.
Big-bang is not recommended because it will uncover too many problems at the same time which could make debugging and bug-fixing more complex than when problems are uncovered incrementally.
Incremental integration: integrate a few components at a time. This approach is better than big-bang integration because it surfaces integration problems in a more manageable way.
Here is an animation that compares the two approaches:
Always code as if the person who ends up maintaining your code will be a violent psychopath who knows where you live. -- Martin Golding
Production code needs to be of high quality. Given how the world is becoming increasingly dependent on software, poor quality code is something no one can afford to tolerate.
Programs should be written and polished until they acquire publication quality. --Niklaus Wirth
Among various dimensions of code quality, such as run-time efficiency, security, and robustness, one of the most important is understandability. This is because in any non-trivial software project, code needs to be read, understood, and modified by other developers later on. Even if you do not intend to pass the code to someone else, code quality is still important because you will become a 'stranger' to your own code someday.
The two code samples given below achieve the same functionality, but one is easier to read.
Bad |
|
Good |
|
Bad
| | Good
|
Be wary when a method is longer than the computer screen, and take corrective action when it goes beyond 30 LOC (lines of code). The bigger the haystack, the harder it is to find a needle.
If you need more than 3 levels of indentation, you're screwed anyway, and should fix your program. --Linux 1.3.53 Coding Style
In particular, avoid arrowhead style code.
A real code example:
Bad |
|
Good |
|
Bad
| | Good
|
Avoid complicated expressions, especially those having many negations and nested parentheses. If you must evaluate complicated expressions, have it done in steps (i.e. calculate some intermediate values first and use them to calculate the final value).
Example:
Bad
return ((length < MAX_LENGTH) || (previousSize != length))
&& (typeCode == URGENT);
Good
boolean isWithinSizeLimit = length < MAX_LENGTH;
boolean isSameSize = previousSize != length;
boolean isValidCode = isWithinSizeLimit || isSameSize;
boolean isUrgent = typeCode == URGENT;
return isValidCode && isUrgent;
Example:
Bad
return ((length < MAX_LENGTH) or (previous_size != length)) and (type_code == URGENT)
Good
is_within_size_limit = length < MAX_LENGTH
is_same_size = previous_size != length
is_valid_code = is_within_size_limit or is_same_size
is_urgent = type_code == URGENT
return is_valid_code and is_urgent
The competent programmer is fully aware of the strictly limited size of his own skull; therefore he approaches the programming task in full humility, and among other things he avoids clever tricks like the plague. -- Edsger Dijkstra
When the code has a number that does not explain the meaning of the number, it is called a "magic number" (as in "the number appears as if by magic"). Using a makes the code easier to understand because the name tells us more about the meaning of the number.
Example:
Bad
| | Good
|
Note: Python does not have a way to make a variable a constant. However, you can use a normal variable with an ALL_CAPS
name to simulate a constant.
Bad
| | Good
|
Similarly, you can have ‘magic’ values of other data types.
Bad
return "Error 1432"; // A magic string!
return "Error 1432" # A magic string!
In general, try to avoid any magic literals.
One essential way to improve code quality is to follow a consistent style. That is why software engineers follow a strict coding standard (aka style guide).
The aim of a coding standard is to make the entire code base look like it was written by one person. A coding standard is usually specific to a programming language and specifies guidelines such as the locations of opening and closing braces, indentation styles and naming styles (e.g. whether to use Hungarian style, Pascal casing, Camel casing, etc.). It is important that the whole team/company uses the same coding standard and that the standard is generally not inconsistent with typical industry practices. If a company's coding standard is very different from what is typically used in the industry, new recruits will take longer to get used to the company's coding style.
IDEs can help to enforce some parts of a coding standard e.g. indentation rules.
Every system is built from a domain-specific language designed by the programmers to describe that system. Functions are the verbs of that language, and classes are the nouns.
-- Robert C. Martin, Clean Code: A Handbook of Agile Software Craftsmanship
Use nouns for classes/variables and verbs for methods/functions.
Examples:
Name for a | Bad | Good |
---|---|---|
Class | CheckLimit | LimitChecker |
Method | result() | calculate() |
Distinguish clearly between single-valued and multi-valued variables.
Examples:
Good
Person student;
ArrayList<Person> students;
Good
name = 'Jim'
names = ['Jim', 'Alice']
Good code is its own best documentation. As you’re about to add a comment, ask yourself, ‘How can I improve the code so that this comment isn’t needed?’ Improve the code and then document it to make it even clearer. -- Steve McConnell, Author of Clean Code
Some think commenting heavily increases the 'code quality'. That is not so. Avoid writing comments to explain bad code. Improve the code to make it self-explanatory.
If the code is self-explanatory, refrain from repeating the description in a comment just for the sake of 'good documentation'.
Bad
//increment x
x++;
//trim the input
trimInput();
Bad
# increment x
x = x + 1
# trim the input
trim_input()
Do not write comments as if they are private notes to yourself. Instead, write them well enough to be understood by another programmer. One type of comment that is almost always useful is the header comment that you write for a class or an operation to explain its purpose.
Examples:
Bad Reason: this comment will only make sense to the person who wrote it
// a quick trim function used to fix bug I detected overnight
void trimInput() {
....
}
Good
/** Trims the input of leading and trailing spaces */
void trimInput() {
....
}
Bad Reason: this comment will only make sense to the person who wrote it
def trim_input():
"""a quick trim function used to fix bug I detected overnight"""
...
Good
def trim_input():
"""Trim the input of leading and trailing spaces"""
...
The first version of the code you write may not be of production quality. It is OK to first concentrate on making the code work, rather than worry over the quality of the code, as long as you improve the quality later. This process of improving a program's internal structure in small steps without modifying its external behavior is called refactoring.
Improving code structure can have many secondary benefits: e.g.
Given below are two common refactorings (more).
Refactoring Name: Consolidate Duplicate Conditional Fragments
Situation: The same fragment of code is in all branches of a conditional expression.
Method: Move it outside of the expression.
Example:
| → |
|
| → |
|
Refactoring Name: Extract Method
Situation: You have a code fragment that can be grouped together.
Method: Turn the fragment into a method whose name explains the purpose of the method.
Example:
void printOwing() {
printBanner();
// print details
System.out.println("name: " + name);
System.out.println("amount " + getOutstanding());
}
void printOwing() {
printBanner();
printDetails(getOutstanding());
}
void printDetails(double outstanding) {
System.out.println("name: " + name);
System.out.println("amount " + outstanding);
}
def print_owing():
print_banner()
# print details
print("name: " + name)
print("amount " + get_outstanding())
def print_owing():
print_banner()
print_details(get_outstanding())
def print_details(amount):
print("name: " + name)
print("amount " + amount)
Some IDEs have builtin support for basic refactorings such as automatically renaming a variable/method/class in all places it has been used.
Refactoring, even if done with the aid of an IDE, may still result in regressions. Therefore, each small refactoring should be followed by regression testing.
Reuse is a major theme in software engineering practices. By reusing tried-and-tested components, the robustness of a new software system can be enhanced while reducing the manpower and time requirement. Reusable components come in many forms; it can be reusing a piece of code, a subsystem, or a whole software.
While you may be tempted to use many libraries/frameworks/platforms that seem to crop up on a regular basis and promise to bring great benefits, note that there are costs associated with reuse. Here are some:
An Application Programming Interface (API) specifies the interface through which other programs can interact with a software component. It is a contract between the component and its clients.
A class has an API (e.g., API of the Java String
class, API of the Python str
class) which is a collection of public methods that you can invoke to make use of the class.
The GitHub API is a collection of web request formats that the GitHub server accepts and their corresponding responses. You can write a program that interacts with GitHub through that API.
When developing large systems, if you define the API of each component early, the development team can develop the components in parallel because the future behavior of the other components are now more predictable.
A library is a collection of modular code that is general and can be used by other programs.
Java classes you get with the JDK (such as String
, ArrayList
, HashMap
, etc.) are library classes that are provided in the default Java distribution.
Natty is a Java library that can be used for parsing strings that represent dates e.g. The 31st of April in the year 2008
built-in modules you get with Python (such as csv
, random
, sys
, etc.) are libraries that are provided in the default Python distribution. Classes such as list
, str
, dict
are built-in library classes that you get with Python.
Colorama is a Python library that can be used for colorizing text in a CLI.
These are the typical steps required to use a library:
The overall structure and execution flow of a specific category of software systems can be very similar. The similarity is an opportunity to reuse at a high scale.
Running example:
IDEs for different programming languages are similar in how they support editing code, organizing project files, debugging, etc.
A software framework is a reusable implementation of a software (or part thereof) providing generic functionality that can be selectively customized to produce a specific application.
Running example:
Eclipse is an IDE framework that can be used to create IDEs for different programming languages.
Some frameworks provide a complete implementation of a default behavior which makes them immediately usable.
Running example:
Eclipse is a fully functional Java IDE out-of-the-box.
A framework facilitates the adaptation and customization of some desired functionality.
Running example:
The Eclipse plugin system can be used to create an IDE for different programming languages while reusing most of the existing IDE features of Eclipse.
E.g. https://marketplace.eclipse.org/content/pydev-python-ide-eclipse
Some frameworks cover only a specific component or an aspect.
JavaFX is a framework for creating Java GUIs. Tkinter is a GUI framework for Python.
More examples of frameworks
Although both frameworks and libraries are reuse mechanisms, there are notable differences:
Libraries are meant to be used ‘as is’ while frameworks are meant to be customized/extended. e.g., writing plugins for Eclipse so that it can be used as an IDE for different languages (C++, PHP, etc.), adding modules and themes to Drupal, and adding test cases to JUnit.
Your code calls the library code while the framework code calls your code. Frameworks use a technique called inversion of control, aka the “Hollywood principle” (i.e. don’t call us, we’ll call you!). That is, you write code that will be called by the framework, e.g. writing test methods that will be called by the JUnit framework. In the case of libraries, your code calls libraries.
A platform provides a runtime environment for applications. A platform is often bundled with various libraries, tools, frameworks, and technologies in addition to a runtime environment but the defining characteristic of a software platform is the presence of a runtime environment.
Technically, an operating system can be called a platform. For example, Windows PC is a platform for desktop applications while iOS is a platform for mobile applications.
Two well-known examples of platforms are JavaEE and .NET, both of which sit above the operating systems layer, and are used to develop enterprise applications. Infrastructure services such as connection pooling, load balancing, remote code execution, transaction management, authentication, security, messaging etc. are done similarly in most enterprise applications. Both JavaEE and .NET provide these services to applications in a customizable way without developers having to implement them from scratch every time.
Cloud computing is the delivery of computing as a service over the network, rather than a product running on a local machine. This means the actual hardware and software is located at a remote location, typically, at a large server farm, while users access them over the network. Maintenance of the hardware and software is managed by the cloud provider while users typically pay for only the amount of services they use. This model is similar to the consumption of electricity; the power company manages the power plant, while the consumers pay them only for the electricity used. The cloud computing model optimizes hardware and software utilization and reduces the cost to consumers. Furthermore, users can scale up/down their utilization at will without having to upgrade their hardware and software. The traditional non-cloud model of computing is similar to everyone buying their own generators to create electricity for their own use.
source: https://commons.wikimedia.org
Cloud computing can deliver computing services at three levels:
Infrastructure as a service (IaaS) delivers computer infrastructure as a service. For example, a user can deploy virtual servers on the cloud instead of buying physical hardware and installing server software on them. Another example would be a customer using storage space on the cloud for off-site storage of data. Rackspace is an example of an IaaS cloud provider. Amazon Elastic Compute Cloud (Amazon EC2) is another one.
Platform as a service (PaaS) provides a platform on which developers can build applications. Developers do not have to worry about infrastructure issues such as deploying servers or load balancing as is required when using IaaS. Those aspects are automatically taken care of by the platform. The price to pay is reduced flexibility; applications written on PaaS are limited to facilities provided by the platform. A PaaS example is the Google App Engine where developers can build applications using Java, Python, PHP, or Go whereas Amazon EC2 allows users to deploy applications written in any language on their virtual servers.
Software as a service (SaaS) allows applications to be accessed over the network instead of installing them on a local machine. For example, Google Docs is a SaaS word processing software, while Microsoft Word is a traditional word processing software.
Developer-to-developer documentation can be in one of two forms:
Another view proposed by Daniele Procida in this article is as follows:
There is a secret that needs to be understood in order to write good software documentation: there isn’t one thing called documentation, there are four. They are: tutorials, how-to guides, explanation and technical reference. They represent four different purposes or functions, and require four different approaches to their creation. Understanding the implications of this will help improve most software documentation - often immensely. ...
TUTORIALS
A tutorial:
- is learning-oriented
- allows the newcomer to get started
- is a lesson
Analogy: teaching a small child how to cook
HOW-TO GUIDES
A how-to guide:
- is goal-oriented
- shows how to solve a specific problem
- is a series of steps
Analogy: a recipe in a cookery book
EXPLANATION
An explanation:
- is understanding-oriented
- explains
- provides background and context
Analogy: an article on culinary social history
REFERENCE
A reference guide:
- is information-oriented
- describes the machinery
- is accurate and complete
Analogy: a reference encyclopedia article
Software documentation (applies to both user-facing and developer-facing) is best kept in a text format for ease of version tracking. A writer-friendly source format is also desirable as non-programmers (e.g., technical writers) may need to author/edit such documents. As a result, formats such as Markdown, AsciiDoc, and PlantUML are often used for software documentation.
When writing project documents, a top-down breadth-first explanation is easier to understand than a bottom-up one.
The main advantage of the top-down approach is that the document is structured like an upside down tree (root at the top) and the reader can travel down a path she is interested in until she reaches the component she is interested to learn in-depth, without having to read the entire document or understand the whole system.
To explain a system called SystemFoo
with two sub-systems, FrontEnd
and BackEnd
, start by describing the system at the highest level of abstraction, and progressively drill down to lower level details. An outline for such a description is given below.
[First, explain what the system is, in a black-box fashion (no internal details, only the external view).]
SystemFoo
is a ....
[Next, explain the high-level architecture of SystemFoo
, referring to its major components only.]
SystemFoo
consists of two major components:FrontEnd
andBackEnd
.
The job ofFrontEnd
is to ... while the job ofBackEnd
is to ...
And this is howFrontEnd
andBackEnd
work together ...
[Now you can drill down to FrontEnd
's details.]
FrontEnd
consists of three major components:A
,B
,C
A
's job is to ...B
's job is to...C
's job is to...
And this is how the three components work together ...
[At this point, further drill down to the internal workings of each component. A reader who is not interested in knowing the nitty-gritty details can skip ahead to the section on BackEnd
.]
In-depth description of
A
In-depth description ofB
...
[At this point drill down to the details of the BackEnd
.]
...
Technical documents exist to help others understand technical details. Therefore, it is not enough for the documentation to be accurate and comprehensive; it should also be comprehensible.
Here are some tips on writing effective documentation.
Aim for 'just enough' developer documentation.
Anything that is already clear in the code need not be described in words. Instead, focus on providing higher level information that is not readily visible in the code or comments.
Refrain from duplicating chunks of text. When describing several similar algorithms/designs/APIs, etc., do not simply duplicate large chunks of text. Instead, describe the similarities in one place and emphasize only the differences in other places. It is very annoying to see pages and pages of similar text without any indication as to how they differ from each other.
Testing: Operating a system or component under specified conditions, observing or recording the results, and making an evaluation of some aspect of the system or component. –- source: IEEE
When testing, you execute a set of test cases. A test case specifies how to perform a test. At a minimum, it specifies the input to the software under test (SUT) and the expected behavior.
Example: A minimal test case for testing a browser:
longfile.html
located in the test data
folder.longfile.html
.Test cases can be determined based on the specification, reviewing similar existing systems, or comparing to the past behavior of the SUT.
For each test case you should do the following:
A test case failure is a mismatch between the expected behavior and the actual behavior. A failure indicates a potential defect (or a bug), unless the error is in the test case itself.
Example: In the browser example above, a test case failure is implied if the scrollbar remains disabled after loading longfile.html
. The defect/bug causing that failure could be an uninitialized variable.
When you modify a system, the modification may result in some unintended and undesirable effects on the system. Such an effect is called a regression.
Regression testing is the re-testing of the software to detect regressions. Note that to detect regressions, you need to retest all related components, even if they had been tested before.
Regression testing is more effective when it is done frequently, after each small change. However, doing so can be prohibitively expensive if testing is done manually. Hence, regression testing is more practical when it is automated.
Unit testing: testing individual units (methods, classes, subsystems, ...) to ensure each piece works correctly.
In OOP code, it is common to write one or more unit tests for each public method of a class.
Here are the code skeletons for a Foo
class containing two methods and a FooTest
class that contains unit tests for those two methods.
class Foo {
String read() {
// ...
}
void write(String input) {
// ...
}
}
class FooTest {
@Test
void read() {
// a unit test for Foo#read() method
}
@Test
void write_emptyInput_exceptionThrown() {
// a unit tests for Foo#write(String) method
}
@Test
void write_normalInput_writtenCorrectly() {
// another unit tests for Foo#write(String) method
}
}
import unittest
class Foo:
def read(self):
# ...
def write(self, input):
# ...
class FooTest(unittest.TestCase):
def test_read(self):
# a unit test for read() method
def test_write_emptyIntput_ignored(self):
# a unit test for write(string) method
def test_write_normalInput_writtenCorrectly(self):
# another unit test for write(string) method
Integration testing : testing whether different parts of the software work together (i.e. integrates) as expected. Integration tests aim to discover bugs in the 'glue code' related to how components interact with each other. These bugs are often the result of misunderstanding what the parts are supposed to do vs what the parts are actually doing.
Suppose a class Car
uses classes Engine
and Wheel
. If the Car
class assumed a Wheel
can support a speed of up to 200 mph but the actual Wheel
can only support a speed of up to 150 mph, it is the integration test that is supposed to uncover this discrepancy.
System testing: take the whole system and test it against the system specification.
System testing is typically done by a testing team (also called a QA team).
System test cases are based on the specified external behavior of the system. Sometimes, system tests go beyond the bounds defined in the specification. This is useful when testing that the system fails 'gracefully' when pushed beyond its limits.
Suppose the SUT is a browser that is supposedly capable of handling web pages containing up to 5000 characters. Given below is a test case to test if the SUT fails gracefully if pushed beyond its limits.
Test case: load a web page that is too big
* Input: loads a web page containing more than 5000 characters.
* Expected behavior: aborts the loading of the page
and shows a meaningful error message.
This test case would fail if the browser attempted to load the large file anyway and crashed.
System testing includes testing against non-functional requirements too. Here are some examples:
Alpha testing is performed by the users, under controlled conditions set by the software development team.
Beta testing is performed by a selected subset of target users of the system in their natural work setting.
An open beta release is the release of not-yet-production-quality-but-almost-there software to the general population. For example, Google’s Gmail was in 'beta' for many years before the label was finally removed.
Here are two alternative approaches to testing a software: Scripted testing and Exploratory testing.
Scripted testing: First write a set of test cases based on the expected behavior of the SUT, and then perform testing based on that set of test cases.
Exploratory testing: Devise test cases on-the-fly, creating new test cases based on the results of the past test cases.
Exploratory testing is ‘the simultaneous learning, test design, and test execution’ [source: bach-et-explained] whereby the nature of the follow-up test case is decided based on the behavior of the previous test cases. In other words, running the system and trying out various operations. It is called exploratory testing because testing is driven by observations during testing. Exploratory testing usually starts with areas identified as error-prone, based on the tester’s past experience with similar systems. One tends to conduct more tests for those operations where more faults are found.
Here is an example thought process behind a segment of an exploratory testing session:
“Hmm... looks like feature x is broken. This usually means feature n and k could be broken too; you need to look at them soon. But before that, you should give a good test run to feature y because users can still use the product if feature y works, even if x doesn’t work. Now, if feature y doesn’t work 100%, you have a major problem and this has to be made known to the development team sooner rather than later...”
Exploratory testing is also known as reactive testing, error guessing technique, attack-based testing, and bug hunting.
Acceptance testing (aka User Acceptance Testing (UAT): test the system to ensure it meets the user requirements.
Acceptance tests give an assurance to the customer that the system does what it is intended to do. Acceptance test cases are often defined at the beginning of the project, usually based on the use case specification. Successful completion of UAT is often a prerequisite to the project sign-off.
Acceptance testing comes after system testing. Similar to system testing, acceptance testing involves testing the whole system.
Some differences between system testing and acceptance testing:
System Testing | Acceptance Testing |
---|---|
Done against the system specification | Done against the requirements specification |
Done by testers of the project team | Done by a team that represents the customer |
Done on the development environment or a test bed | Done on the deployment site or on a close simulation of the deployment site |
Both negative and positive test cases | More focus on positive test cases |
Note: negative test cases: cases where the SUT is not expected to work normally e.g. incorrect inputs; positive test cases: cases where the SUT is expected to work normally
Requirement specification versus system specification
The requirement specification need not be the same as the system specification. Some example differences:
Requirements specification | System specification |
---|---|
limited to how the system behaves in normal working conditions | can also include details on how it will fail gracefully when pushed beyond limits, how to recover, etc. specification |
written in terms of problems that need to be solved (e.g. provide a method to locate an email quickly) | written in terms of how the system solves those problems (e.g. explain the email search feature) |
specifies the interface available for intended end-users | could contain additional APIs not available for end-users (for the use of developers/testers) |
However, in many cases one document serves as both a requirement specification and a system specification.
Passing system tests does not necessarily mean passing acceptance testing. Some examples:
An automated test case can be run programmatically and the result of the test case (pass or fail) is determined programmatically. Compared to manual testing, automated testing reduces the effort required to run tests repeatedly and increases precision of testing (because manual testing is susceptible to human errors).
A simple way to semi-automate testing of a CLI (Command Line Interface) app is by using input/output re-direction.
Let's assume you are testing a CLI app called AddressBook
. Here are the detailed steps:
Store the test input in the text file input.txt
.
Store the output you expect from the SUT in another text file expected.txt
.
Run the program as given below, which will redirect the text in input.txt
as the input to AddressBook
and similarly, will redirect the output of AddressBook to a text file output.txt
. Note that this does not require any code changes to AddressBook
.
java AddressBook < input.txt > output.txt
The way to run a CLI program differs based on the language.
e.g., In Python, assuming the code is in AddressBook.py
file, use the command
python AddressBook.py < input.txt > output.txt
If you are using Windows, use a normal command window to run the app, not a PowerShell window.
Next, you compare output.txt
with the expected.txt
. This can be done using a utility such as Windows' FC
(i.e. File Compare) command, Unix's diff
command, or a GUI tool such as WinMerge.
FC output.txt expected.txt
Note that the above technique is only suitable when testing CLI apps, and only if the exact output can be predetermined. If the output varies from one run to the other (e.g. it contains a time stamp), this technique will not work. In those cases, you need more sophisticated ways of automating tests.
Except for trivial , is not practical because such testing often requires a massive/infinite number of test cases.
Consider the test cases for adding a string object to a :
Exhaustive testing of this operation can take many more test cases.
Program testing can be used to show the presence of bugs, but never to show their absence!
--Edsger Dijkstra
Every test case adds to the cost of testing. In some systems, a single test case can cost thousands of dollars e.g. on-field testing of flight-control software. Therefore, test cases need to be designed to make the best use of testing resources. In particular:
Testing should be effective i.e., it finds a high percentage of existing bugs e.g., a set of test cases that finds 60 defects is more effective than a set that finds only 30 defects in the same system.
Testing should be efficient i.e., it has a high rate of success (bugs found/test cases) a set of 20 test cases that finds 8 defects is more efficient than another set of 40 test cases that finds the same 8 defects.
For testing to be , each new test you add should be targeting a potential fault that is not already targeted by existing test cases. There are test case design techniques that can help us improve the E&E of testing.
Test case design can be of three types, based on how much of the SUT's internal details are considered when designing test cases:
Black-box (aka specification-based or responsibility-based) approach: test cases are designed exclusively based on the SUT’s specified external behavior.
White-box (aka glass-box or structured or implementation-based) approach: test cases are designed based on what is known about the SUT’s implementation, i.e. the code.
Gray-box approach: test case design uses some important information about the implementation. For example, if the implementation of a sort operation uses different algorithms to sort lists shorter than 1000 items and lists longer than 1000 items, more meaningful test cases can then be added to verify the correctness of both algorithms.
Consider the testing of the following operation.
isValidMonth(m)
: returns true
if m
(and int
) is in the range [1..12]
It is inefficient and impractical to test this method for all integer values [-MIN_INT to MAX_INT]
. Fortunately, there is no need to test all possible input values. For example, if the input value 233
fails to produce the correct result, the input 234
is likely to fail too; there is no need to test both.
In general, most SUTs do not treat each input in a unique way. Instead, they process all possible inputs in a small number of distinct ways. That means a range of inputs is treated the same way inside the SUT. Equivalence partitioning (EP) is a test case design technique that uses the above observation to improve the E&E of testing.
Equivalence partition (aka equivalence class): A group of test inputs that are likely to be processed by the SUT in the same way.
By dividing possible inputs into equivalence partitions you can,
Equivalence partitions (EPs) are usually derived from the specifications of the SUT.
These could be EPs for the isValidMonth example:
true
(produces false
)true
true
(produces false
)When the SUT has multiple inputs, you should identify EPs for each input.
Consider the method duplicate(String s, int n): String
which returns a String
that contains s
repeated n
times.
Example EPs for s
:
Example EPs for n
:
0
An EP may not have adjacent values.
Consider the method isPrime(int i): boolean
that returns true if i
is a prime number.
EPs for i
:
Some inputs have only a small number of possible values and a potentially unique behavior for each value. In those cases, you have to consider each value as a partition by itself.
Consider the method showStatusMessage(GameStatus s): String
that returns a unique String
for each of the possible values of s (GameStatus
is an enum
). In this case, each possible value of s
will have to be considered as a partition.
Note that the EP technique is merely a heuristic and not an exact science, especially when applied manually (as opposed to using an automated program analysis tool to derive EPs). The partitions derived depend on how one ‘speculates’ the SUT to behave internally. Applying EP under a glass-box or gray-box approach can yield more precise partitions.
Consider the EPs given above for the method isValidMonth
. A different tester might use these EPs instead:
true
false
Some more examples:
Specification | Equivalence partitions |
---|---|
| [ |
| [ |
When deciding EPs of OOP methods, you need to identify the EPs of all data participants that can potentially influence the behaviour of the method, such as,
Consider this method in the DataStack
class:
push(Object o): boolean
o
to the top of the stack if the stack is not full.true
if the push operation was a success.MutabilityException
if the global flag FREEZE==true
.InvalidValueException
if o
is null.EPs:
DataStack
object: [full] [not full]o
: [null] [not null]FREEZE
: [true][false] Consider a simple Minesweeper app. What are the EPs for the newGame()
method of the Logic
component?
As newGame()
does not have any parameters, the only obvious participant is the Logic
object itself.
Note that if the glass-box or the grey-box approach is used, other associated objects that are involved in the method might also be included as participants. For example, the Minefield
object can be considered as another participant of the newGame()
method. Here, the black-box approach is assumed.
Next, let us identify equivalence partitions for each participant. Will the newGame()
method behave differently for different Logic
objects? If yes, how will it differ? In this case, yes, it might behave differently based on the game state. Therefore, the equivalence partitions are:
PRE_GAME
: before the game starts, minefield does not exist yetREADY
: a new minefield has been created and the app is waiting for the player’s first moveIN_PLAY
: the current minefield is already in useWON
, LOST
: let us assume that newGame()
behaves the same way for these two values Consider the Logic
component of the Minesweeper application. What are the EPs for the markCellAt(int x, int y)
method? The partitions in bold represent valid inputs.
Logic
: PRE_GAME, READY, IN_PLAY, WON, LOSTx
: [MIN_INT..-1] [0..(W-1)] [W..MAX_INT] (assuming a minefield size of WxH)y
: [MIN_INT..-1] [0..(H-1)] [H..MAX_INT]Cell
at (x,y)
: HIDDEN, MARKED, CLEAREDBoundary Value Analysis (BVA) is a test case design heuristic that is based on the observation that bugs often result from incorrect handling of boundaries of equivalence partitions. This is not surprising, as the end points of boundaries are often used in branching instructions, etc., where the programmer can make mistakes.
The markCellAt(int x, int y)
operation could contain code such as if (x > 0 && x <= (W-1))
which involves the boundaries of x’s equivalence partitions.
BVA suggests that when picking test inputs from an equivalence partition, values near boundaries (i.e. boundary values) are more likely to find bugs.
Boundary values are sometimes called corner cases.
Typically, you should choose three values around the boundary to test: one value from the boundary, one value just below the boundary, and one value just above the boundary. The number of values to pick depends on other factors, such as the cost of each test case.
Some examples:
Equivalence partition | Some possible test values (boundaries are in bold) |
---|---|
[1-12] | 0,1,2, 11,12,13 |
[MIN_INT, 0] | MIN_INT, MIN_INT+1, -1, 0 , 1 |
[any non-null String] | Empty String, a String of maximum possible length |
[prime numbers] | No specific boundary |
[non-empty Stack] | Stack with: no elements, one element, two elements, no empty spaces, only one empty space |
Code review is the systematic examination of code with the intention of finding where the code can be improved.
Reviews can be done in various forms. Some examples below:
Pull Request reviews
In pair programming
Formal inspections
Inspections involve a group of people systematically examining project artifacts to discover defects. Members of the inspection team play various roles during the process, such as:
Advantages of code review over testing:
Disadvantages:
Static analysis: Static analysis is the analysis of code without actually executing the code.
Static analysis of code can find useful information such as unused variables, unhandled exceptions, style errors, and statistics. Most modern IDEs come with some inbuilt static analysis capabilities. For example, an IDE can highlight unused variables as you type the code into the editor.
The term static in static analysis refers to the fact that the code is analyzed without executing the code. In contrast, dynamic analysis requires the code to be executed to gather additional information about the code e.g., performance characteristics.
Higher-end static analysis tools (static analyzers) can perform more complex analysis such as locating potential bugs, memory leaks, inefficient code structures, etc.
Some example static analyzers for Java: CheckStyle, PMD, FindBugs
Linters are a subset of static analyzers that specifically aim to locate areas where the code can be made 'cleaner'.
Formal verification uses mathematical techniques to prove the correctness of a program.
Advantages:
Disadvantages:
A Work Breakdown Structure (WBS) depicts information about tasks and their details in terms of subtasks. When managing projects, it is useful to divide the total work into smaller, well-defined units. Relatively complex tasks can be further split into subtasks. In complex projects, a WBS can also include prerequisite tasks and effort estimates for each task.
The high level tasks for a single iteration of a small project could look like the following:
Task ID | Task | Estimated Effort | Prerequisite Task |
---|---|---|---|
A | Analysis | 1 man day | - |
B | Design | 2 man day | A |
C | Implementation | 4.5 man day | B |
D | Testing | 1 man day | C |
E | Planning for next version | 1 man day | D |
The effort is traditionally measured in man hour/day/month i.e. work that can be done by one person in one hour/day/month. The Task ID is a label for easy reference to a task. Simple labeling is suitable for a small project, while a more informative labeling system can be adopted for bigger projects.
An example WBS for a game development project.
Task ID | Task | Estimated Effort | Prerequisite Task |
---|---|---|---|
A | High level design | 1 man day | - |
B |
Detail design
|
2 man day
| A |
C |
Implementation
|
4.5 man day
|
|
D | System Testing | 1 man day | C |
E | Planning for next version | 1 man day | D |
All tasks should be well-defined. In particular, it should be clear as to when the task will be considered done.
Some examples of ill-defined tasks and their better-defined counterparts:
Bad | Better |
---|---|
more coding | implement component X |
do research on UI testing | find a suitable tool for testing the UI |
A milestone is the end of a stage which indicates significant progress. You should take into account dependencies and priorities when deciding on the features to be delivered at a certain milestone.
Each intermediate product release is a milestone.
In some projects, it is not practical to have a very detailed plan for the whole project due to the uncertainty and unavailability of required information. In such cases, you can use a high-level plan for the whole project and a detailed plan for the next few milestones.
Milestones for the Minesweeper project, iteration 1
Day | Milestones |
---|---|
Day 1 | Architecture skeleton completed |
Day 3 | ‘new game’ feature implemented |
Day 4 | ‘new game’ feature tested |
A buffer is time set aside to absorb any unforeseen delays. It is very important to include buffers in a software project schedule because effort/time estimations for software development are notoriously hard. However, do not inflate task estimates to create hidden buffers; have explicit buffers instead. Reason: With explicit buffers, it is easier to detect incorrect effort estimates which can serve as feedback to improve future effort estimates.
Keeping track of project tasks (who is doing what, which tasks are ongoing, which tasks are done etc.) is an essential part of project management. In small projects, it may be possible to keep track of tasks using simple tools such as online spreadsheets or general-purpose/light-weight task tracking tools such as Trello. Bigger projects need more sophisticated task tracking tools.
Issue trackers (sometimes called bug trackers) are commonly used to track task assignment and progress. Most online project management software such as GitHub, SourceForge, and BitBucket come with an integrated issue tracker.
A screenshot from the Jira Issue tracker software (Jira is part of the BitBucket project management tool suite):
Given below are three commonly used team structures in software development. Irrespective of the team structure, it is a good practice to assign roles and responsibilities to different team members so that someone is clearly in charge of each aspect of the project. In comparison, the ‘everybody is responsible for everything’ approach can result in more chaos and hence slower progress.
Egoless team
In this structure, every team member is equal in terms of responsibility and accountability. When any decision is required, consensus must be reached. This team structure is also known as a democratic team structure. This team structure usually finds a good solution to a relatively hard problem as all team members contribute ideas.
However, the democratic nature of the team structure bears a higher risk of falling apart due to the absence of an authority figure to manage the team and resolve conflicts.
Chief programmer team
Frederick Brooks proposed that software engineers learn from the medical surgical team in an operating room. In such a team, there is always a chief surgeon, assisted by experts in other areas. Similarly, in a chief programmer team structure, there is a single authoritative figure, the chief programmer. Major decisions, e.g. system architecture, are made solely by him/her and obeyed by all other team members. The chief programmer directs and coordinates the effort of other team members. When necessary, the chief will be assisted by domain specialists e.g. business specialists, database experts, network technology experts, etc. This allows individual group members to concentrate solely on the areas in which they have sound knowledge and expertise.
The success of such a team structure relies heavily on the chief programmer. Not only must he/she be a superb technical hand, he/she also needs good managerial skills. Under a suitably qualified leader, such a team structure is known to produce successful work.
Strict hierarchy team
At the opposite extreme of an egoless team, a strict hierarchy team has a strictly defined organization among the team members, reminiscent of the military or a bureaucratic government. Each team member only works on his/her assigned tasks and reports to a single “boss”.
In a large, resource-intensive, complex project, this could be a good team structure to reduce communication overhead.
Software development goes through different stages such as requirements, analysis, design, implementation and testing. These stages are collectively known as the software development life cycle (SDLC). There are several approaches, known as software development life cycle models (also called software process models), that describe different ways to go through the SDLC. Each process model prescribes a "roadmap" for the software developers to manage the development effort. The roadmap describes the aims of the development stage(s), the artifacts or outcome of each stage, as well as the workflow i.e. the relationship between stages.
The sequential model, also called the waterfall model, models software development as a linear process, in which the project is seen as progressing steadily in one direction through the development stages. The name waterfall stems from how the model is drawn to look like a waterfall (see below).
When one stage of the process is completed, it should produce some artifacts to be used in the next stage. For example, upon completion of the requirements stage, a comprehensive list of requirements is produced that will see no further modifications. A strict application of the sequential model would require each stage to be completed before starting the next.
This could be a useful model when the problem statement is well-understood and stable. In such cases, using the sequential model should result in a timely and systematic development effort, provided that all goes well. As each stage has a well-defined outcome, the progress of the project can be tracked with relative ease.
The major problem with this model is that the requirements of a real-world project are rarely well-understood at the beginning and keep changing over time. One reason for this is that users are generally not aware of how a software application can be used without prior experience in using a similar application.
The iterative model (sometimes called iterative and incremental) advocates having several iterations of SDLC. Each of the iterations could potentially go through all the development stages, from requirements gathering to testing & deployment. Roughly, it appears to be similar to several cycles of the sequential model.
In this model, each of the iterations produces a new version of the product. Feedback on the new version can then be fed to the next iteration. Taking the Minesweeper game as an example, the iterative model will deliver a fully playable version from the early iterations. However, the first iteration will have primitive functionality, for example, a clumsy text based UI, fixed board size, limited randomization, etc. These functionalities will then be improved in later releases.
The iterative model can take a breadth-first or a depth-first approach to iteration planning.
Most projects use a mixture of breadth-first and depth-first iterations i.e., an iteration can contain some breadth-first work as well as some depth-first work.
In 2001, a group of prominent software engineering practitioners met and brainstormed for an alternative to documentation-driven, heavyweight software development processes that were used in most large projects at the time. This resulted in something called the agile manifesto (a vision statement of what they were looking to do).
You are uncovering better ways of developing software by doing it and helping others do it.
Through this work you have come to value:
- Individuals and interactions over processes and tools
- Working software over comprehensive documentation
- Customer collaboration over contract negotiation
- Responding to change over following a plan
That is, while there is value in the items on the right, you value the items on the left more.
-- Extract from the Agile Manifesto
Subsequently, some of the signatories of the manifesto went on to create process models that try to follow it. These processes are collectively called agile processes. Some of the key features of agile approaches are:
There are a number of agile processes in the development world today. eXtreme Programming (XP) and Scrum are two of the well-known ones.
The following description was adapted from the XP home page, emphasis added:
Extreme Programming (XP) stresses customer satisfaction. Instead of delivering everything you could possibly want on some date far in the future, this process delivers the software you need as you need it.
XP aims to empower developers to confidently respond to changing customer requirements, even late in the life cycle.
XP emphasizes teamwork. Managers, customers, and developers are all equal partners in a collaborative team. XP implements a simple, yet effective environment enabling teams to become highly productive. The team self-organizes around the problem to solve it as efficiently as possible.
XP aims to improve a software project in five essential ways: communication, simplicity, feedback, respect, and courage. Extreme Programmers constantly communicate with their customers and fellow programmers. They keep their design simple and clean. They get feedback by testing their software starting on day one. Every small success deepens their respect for the unique contributions of each and every team member. With this foundation, Extreme Programmers are able to courageously respond to changing requirements and technology.
XP has a set of simple rules. XP is a lot like a jig saw puzzle with many small pieces. Individually the pieces make no sense, but when combined together a complete picture can be seen. This flow chart shows how Extreme Programming's rules work together.
Pair programming, CRC cards, project velocity, and standup meetings are some interesting topics related to XP. Refer to extremeprogramming.org to find out more about XP.
This description of Scrum was adapted from Wikipedia [retrieved on 18/10/2011], emphasis added:
Scrum is a process skeleton that contains sets of practices and predefined roles. The main roles in Scrum are:
A Scrum project is divided into iterations called Sprints. A sprint is the basic unit of development in Scrum. Sprints tend to last between one week and one month, and are a timeboxed (i.e. restricted to a specific duration) effort of a constant length.
Each sprint is preceded by a planning meeting, where the tasks for the sprint are identified and an estimated commitment for the sprint goal is made, and followed by a review or retrospective meeting, where the progress is reviewed and lessons for the next sprint are identified.
During each sprint, the team creates a potentially deliverable product increment (for example, working and tested software). The set of features that go into a sprint come from the product backlog, which is a prioritized set of high level requirements of work to be done. Which backlog items go into the sprint is determined during the sprint planning meeting. During this meeting, the Product Owner informs the team of the items in the product backlog that he or she wants completed. The team then determines how much of this they can commit to complete during the next sprint, and records this in the sprint backlog. During a sprint, no one is allowed to change the sprint backlog, which means that the requirements are frozen for that sprint. Development is timeboxed such that the sprint must end on time; if requirements are not completed for any reason they are left out and returned to the product backlog. After a sprint is completed, the team demonstrates the use of the software.
Scrum enables the creation of self-organizing teams by encouraging co-location of all team members, and verbal communication between all team members and disciplines in the project.
A key principle of Scrum is its recognition that during a project the customers can change their minds about what they want and need (often called requirements churn), and that unpredicted challenges cannot be easily addressed in a traditional predictive or planned manner. As such, Scrum adopts an empirical approach—accepting that the problem cannot be fully understood or defined, focusing instead on maximizing the team’s ability to deliver quickly and respond to emerging requirements.
Daily Scrum is another key scrum practice. The description below was adapted from https://www.mountaingoatsoftware.com (emphasis added):
In Scrum, on each day of a sprint, the team holds a daily scrum meeting called the "daily scrum.” Meetings are typically held in the same location and at the same time each day. Ideally, a daily scrum meeting is held in the morning, as it helps set the context for the coming day's work. These scrum meetings are strictly time-boxed to 15 minutes. This keeps the discussion brisk but relevant.
...
During the daily scrum, each team member answers the following three questions:
...
The daily scrum meeting is not used as a problem-solving or issue resolution meeting. Issues that are raised are taken offline and usually dealt with by the relevant subgroup immediately after the meeting.
Revision control is the process of managing multiple versions of a piece of information. In its simplest form, this is something that many people do by hand: every time you modify a file, save it under a new name that contains a number, each one higher than the number of the preceding version.
Manually managing multiple versions of even a single file is an error-prone task, though, so software tools to help automate this process have long been available. The earliest automated revision control tools were intended to help a single user to manage revisions of a single file. Over the past few decades, the scope of revision control tools has expanded greatly; they now manage multiple files, and help multiple people to work together. The best modern revision control tools have no problem coping with thousands of people working together on projects that consist of hundreds of thousands of files.
Revision control software will track the history and evolution of your project, so you don't have to. For every change, you'll have a log of who made it; why they made it; when they made it; and what the change was.
Revision control software makes it easier for you to collaborate when you're working with other people. For example, when people more or less simultaneously make potentially incompatible changes, the software will help you to identify and resolve those conflicts.
It can help you to recover from mistakes. If you make a change that later turns out to be an error, you can revert to an earlier version of one or more files. In fact, a really good revision control tool will even help you to efficiently figure out exactly when a problem was introduced.
It will help you to work simultaneously on, and manage the drift between, multiple versions of your project. Most of these reasons are equally valid, at least in theory, whether you're working on a project by yourself, or with a hundred other people.
-- [adapted from bryan-mercurial-guide]
RCS: Revision control software are the software tools that automate the process of Revision Control i.e. managing revisions of software artifacts.
Revision: A revision (some seem to use it interchangeably with version while others seem to distinguish the two -- here, let us treat them as the same, for simplicity) is a state of a piece of information at a specific time that is a result of some changes to it e.g., if you modify the code and save the file, you have a new revision (or a version) of that file.
Revision control software are also known as Version Control Software (VCS), and by a few other names.
The basic UML notations used to represent a class:
A Table
class shown in UML notation:
The 'Operations' compartment and/or the 'Attributes' compartment may be omitted if such details are not important for the task at hand. Similarly, some attributes/operations can be omitted if not relevant. 'Attributes' always appear above the 'Operations' compartment. All operations should be in one compartment rather than each operation in a separate compartment. Same goes for attributes.
The visibility of attributes and operations is used to indicate the level of access allowed for each attribute or operation. The types of visibility and their exact meanings depend on the programming language used. Here are some common visibilities and how they are indicated in a class diagram:
+
: public-
: private#
: protected~
: package private Table
class with visibilities shown:
Generic classes can be shown as given below. The notation format is shown on the left, followed by two examples.
You should use a solid line to show an association between two classes.
This example shows an association between the Admin
class and the Student
class:
Use arrowheads to indicate the navigability of an association.
In this example, the navigability is unidirectional, and is from the Logic
class to the Minefield
class. That means if a Logic
object L
is associated with a Minefield
object M
, L
has a reference to M
but M
doesn't have a reference to L
.
class Logic {
Minefield minefield;
// ...
}
class Minefield {
//...
}
class Logic:
def __init__(self):
self.minefield = None
# ...
class Minefield:
# ...
Here is an example of a bidirectional navigability; i.e., if a Dog
object d
is associated with a Man
object m
, d
has a reference to m
and m
has a reference to d
.
class Dog {
Man man;
// ...
}
class Man {
Dog dog;
// ...
}
class Dog:
def __init__(self):
self.man = None
# ...
class Man:
def __init__(self):
self.dog = None
# ...
Navigability can be shown in class diagrams as well as object diagrams.
According to this object diagram, the given Logic
object is associated with and aware of two MineField
objects.
Association Role labels are used to indicate the role played by the classes in the association.
This association represents a marriage between a Man
object and a Woman
object. The respective roles played by objects of these two classes are husband
and wife
.
Note how the variable names match closely with the association roles.
class Man {
Woman wife;
}
class Woman {
Man husband;
}
class Man:
def __init__(self):
self.wife = None # a Woman object
class Woman:
def __init__(self):
self.husband = None # a Man object
The role of Student
objects in this association is charges
(i.e. Admin is in charge of students)
class Admin {
List<Student> charges;
}
class Admin:
def __init__(self):
self.charges = [] # list of Student objects
Association labels describe the meaning of the association. The arrow head indicates the direction in which the label is to be read.
In this example, the same association is described using two different labels.
Admin
class is associated with Student
class because an Admin
object uses a Student
object.Admin
class is associated with Student
class because a Student
object is used by an Admin
object.Commonly used multiplicities:
0..1
: optional, can be linked to 0 or 1 objects.1
: compulsory, must be linked to one object at all times.*
: can be linked to 0 or more objects.n..m
: the number of linked objects must be within n
to m
inclusive. In the diagram below, an Admin
object administers (is in charge of) any number of students but a Student
object must always be under the charge of exactly one Admin
object.
In the diagram below,
UML uses a dashed arrow to show dependencies.
Two examples of dependencies:
Dependencies vs associations:
Foo
accessing a constant in Bar
but there is no association/inheritance from Foo
to Bar
.An association can be shown as an attribute instead of a line.
Association multiplicities and the default value can be shown as part of the attribute using the following notation. Both are optional.
name: type [multiplicity] = default value
The diagram below depicts a multi-player Square Game being played on a board comprising of 100 squares. Each of the squares may be occupied with any number of pieces, each belonging to a certain player.
A Piece
may or may not be on a Square
. Note how that association can be replaced by an isOn
attribute of the Piece
class. The isOn
attribute can either be null
or hold a reference to a Square
object, matching the 0..1
multiplicity of the association it replaces. The default value is null
.
The association that a Board
has 100 Square
s can be shown in either of these two ways:
Show each association as either an attribute or a line but not both. A line is preferred as it is easier to spot.
UML uses a hollow diamond to indicate an aggregation.
Notation:
Example:
Aggregation vs Composition
The distinction between composition (◆) and aggregation (◇) is rather blurred. Martin Fowler’s famous book UML Distilled advocates omitting the aggregation symbol altogether because using it adds more confusion than clarity.
An object diagram shows an object structure at a given point of time.
An example object diagram:
Notation:
Notes:
car1:Car
are underlined.objectName:ClassName
is meant to say 'an instance of ClassName
identified as objectName
'.:Car
which is meant to say 'an unnamed instance of a Car object'.Some example objects:
UML activity diagrams (AD) can model workflows. Flow charts are another type of diagram that can model workflows. Activity diagrams are the UML equivalent of flow charts.
An example activity diagram:
[source:wikipeida]
An activity diagram (AD) captures an activity through the actions and control flows that make up the activity.
Note the slight difference between the start node and the end node which represent the start and the end of the activity, respectively.
This activity diagram shows the action sequence of the activity a passenger rides the bus:
A branch node shows the start of alternate paths. Each control flow exiting a branch node has a guard condition: a boolean condition that should be true for execution to take that path. Exactly one of the guard conditions should be true at any given branch node.
A merge node shows the end of alternate paths.
Both branch nodes and merge nodes are diamond shapes. Guard conditions must be in square brackets.
The AD below shows alternate paths involved in the workflow of the activity shop for product:
Some acceptable simplifications (by convention):
[Else]
condition.The AD below illustrates the simplifications mentioned above:
Fork nodes indicate the start of flows of control.
Join nodes indicate the end of parallel paths.
Both have the same notation: a bar.
In a , execution along all parallel paths should be complete before the execution can start on the outgoing control flow of the join.
In this activity diagram (from an online shop website) the actions User browses products and System records browsing data happen in parallel. Both of them need to finish before the log out action can take place.
Compared to the notation for class diagrams, object diagrams differ in the following ways:
:
before the class nameFurthermore, multiple object diagrams can correspond to a single class diagram.
Both object diagrams are derived from the same class diagram shown earlier. In other words, each of these object diagrams shows ‘an instance of’ the same class diagram.
When the class diagram has an inheritance relationship, the object diagram should show either an object of the parent class or the child class, but not both.
Suppose Employee
is a child class of the Person
class. The class diagram will be as follows:
Now, how do you show an Employee
object named jake
?
This is not correct, as there should be only one object.
This is OK.
This is OK, as jake
is a Person
too.
That is, we can show the parent class instead of the child class if the child class doesn't matter to the purpose of the diagram (i.e., the reader of this diagram will not need to know that jake
is in fact an Employee
).
Association labels/roles can be omitted unless they add value (e.g., showing them is useful if there are multiple associations between the two classes in concern -- otherwise you wouldn't know which association the object diagram is showing)
Consider this class diagram and the object diagram:
We can clearly see that both Adam and Eve lives in hall h1 (i.e., OK to omit the association label lives in
) but we can't see if History is Adam's major or his minor (i.e., the diagram should have included either an association label or a role there). In contrast, we can see Eve is an English major.