notes - Computer Science

CS 122 – Jan. 9
The nature of computer science
General solutions are better
What is a program?
OO programming
Broad outline of topics:
– Program design, GUIs, recursion, sorting,
• Handout:
– Review of Java (errors, I/O, ArrayList)
Computer science
The study of how we solve problems
How we manipulate and represent information
Ultimately, think of rep’n of brain & knowledge
 Inspiration for hierarchical data structures
• In problem solving, we crave general solutions
– Peanut butter & jelly sandwich  any sandwich
– Average of 3 numbers  No restriction
• What kinds of programs have you seen, written?
• Software can be classified in many ways. First,
consider this breakdown: simple & not-so-simple
• “Simple” problems can be solved easily with 1 source
– Input, calculations, output
– Ex. How many vowels does a 9-letter word typically have?
• Object oriented technique for larger concerns
• Critical for us to design first – why?
• You can find my notes for previous course here
• Java API documentation
• Handout highlights some important things to
remember: common errors, I/O, ArrayList class
CS 122 – Jan. 11
• OO programming
– Good for development of larger programs
– Encapsulation
• Examples
• Handout
– Review elementary Java
OO programming
• In the problem description, look for nouns first
• A noun can be a primitive type or a class
• Often we define our own classes
– E.g. Card, Deck, Board, Player
• An object has
– Attributes (i.e. little nouns)
– Operations
• In a larger program, we have several source files (i.e.
• OO programming means we look for nouns first
• A “noun” could be
– Primitive type
– Class object, a complex noun with attributes + operations
• We define classes that accurately reflect the information in
our program, but not every detail (abstraction).
• Defining your own class might not be necessary.
– Sometimes the type/class we want already exists in the
language or API.
– Your class might have just 1 attribute or just 1 instance
• Separate implementation from use
– Outside class implementation (e.g. in Driver), we should
not be concerned with object rep’n or how operations
– Logic & strategy of a game does not depend on how
information is stored internally
– This is why attributes are declared private
• Only be concerned with “interface” information
– Names of methods (including constructors), necessary
parameters, return type
• Let’s design a small OO program pertaining to
information about mountains.
• Mountain class
– Attributes for: continent, name, height
– Constructors
– Can have get/set methods for each attribute
• Driver class with main( ) function
– Contains the logic that actually does something with the
data. Let’s read a text file containing mountain
information, and list those over 10,000 feet.
• Outline of driver class:
– Read file
– For each line:
• Parse information contained in the line
• Create mountain object
• Add to array or Arraylist
– Close file
– For each mountain
• If it’s over 10,000 feet, print it out
• Have we solved the problem?
• See review handouts
Consider these
(From handout)
• Count the number of words in a text file
• Modify mountain program to display average height
of all the mountains
• Compute the cost of tiling a floor
– What class should be defined? Its attributes & operations?
CS 122 – Jan. 13
Briefly discuss room tiling problem
• Handouts:
– Encryption (interfaces)
– Shapes
(Please show me your lab if you have not already done so.)
• If you can tile/carpet a single room, you can do a
whole house, etc.
• A House contains Rooms
– Create a House class that has an attribute representing a
collection (i.e. ArrayList) of Room objects
– To find total cost for house, write a loop that traverses this
list of rooms, finding the cost for each one.
• Can extend outward to more layers: Subdivision, City
• This class relationship is aggregation, or “has a”
• After extending our Room program to include a
House class, we now have 3 classes:
– Driver class
• Create a House object
• Find the total cost to tile/carpet this house
– House class
• Contains collection of Rooms
• For each room, call appropriate room.findCost( )
and add them all up.
– Room class
• Already has a formula to find its cost
• How is an OO program organized?
• Look at Driver/main( ) to see what a program is
• Look at the other classes to see what the program is
made of (big nouns), and what it is capable of
Class building helpers
• Designing and implementing a class can be time
consuming. There are 2 OO features to help. They
start with the letter “I”
– Interface
– Inheritance
• An interface is a to-do list of methods we want for a
• An interface is not a class!
• We say that a class implements an interface.
• It’s possible for 2+ classes to implement the same
• Geometrical shapes like Rectangle, Circle
– See handout
• Encryption
– Different ways to do it
– Need to define functions to encrypt & decrypt
– Many possible implementations!
• Note: compiler will complain if you forget to
implement all methods listed in interface
• Another class relationship
– “is a” rather than “has a”
• Useful for what?
• To improve existing class, add functionality
– Random has nextInt( ) and nextDouble( ), but not
nextLetter( ) or nextWord( )
• To make a more specialized class
– Publication  Book or Magazine
– Animal  Fish, Reptile, Bird
• When you inherit something, you can override it
Shape example
• Shape class
• Rectangle, Cube and Sphere inherit from shape
• 3-D inherits from 2-D
– Yes, you can have inheritance among interfaces!
• Rectangle implements 2-D
• Cube and Sphere implement 3-D
• Can represent all these relationships with UML
CS 122 – Jan. 18
• Inheritance
• Polymorphism
• Handouts:
– Zoo
– Speaker
– Pet
Zoo program
• What inheritance relationships do we have among
the classes: Animal, Bird, Fish, Penguin Reptile,
• ArrayList<Animal>
• “super” is called implicitly, for default constructor
– For other kinds of constructors, call super( ) first to
initialize inherited attributes.
• exercise( ) uses instanceof – saves us from having to
implement exercise( ) in every subclass
• We override toString( )
• When a method is called, how do we know where to
find its implementation?
Penguin p = new Penguin( ); );
– Which fun( ) is being called?
– What if fun( ) does not exist in
• You can look up, but not down when looking for
– E.g. equals( ) and toString( ) are defined in Object, and we
normally override these.
• What does this word mean?
• At any time, an object can change its dynamic type
by a call to a constructor.
• When declared, object is given a static type to ensure
program will compile.
• Examples
– Speaker (interface)
– Pet (inheritance)
CS 122 – Jan. 20
• Handling exceptions
• Begin study of GUIs
– Start with simple applets
– Responding to events
• Handouts:
Steps in writing a GUI
Applet examples
• What can go wrong when a program runs? (run-time
– File not found, number format, etc.
• Important errors have their own classes that we can
• To allow program to continue gracefully, use
try/catch block
• Example: Read a menu, and average the prices,
ignoring all the words
– See handout
Steps in creating GUI
• Define & initialize the components you want
– Where? In main( ), init( ), or constructor
– Component = object to show on screen
• Decide how things should be drawn
– Details in paint function
• Create listener objects
– Figure out which components (e.g. button) will generate
events for this listener.
– Define how listener should respond to event
Typical Applet Structure
• My applet class
– Attributes
– Constructor
• Inner Listener class – somewhere in here call repaint( )
• Create objects that we will draw
• Call the listener
– Paint method
Applet examples
• Applet = small program, not stand-alone application.
To be run in a browser.
• In each program, you’ll see 3 parts
– init( ) or constructor
– paint( ) if we want to draw geometric shapes
– Listener class, with object declared in init( )
• FirstApplet – doesn’t have listener. We just take
input string and go.
• MouseApplet3 – User can click, and blue square
follows! Look at mousePressed( )
• MouseApplet4 – alternate blue/red
Applets, continued
• In mouseApplet4: How is the grid color changing?
• How often does a certain code fragment get
– init( ), constructor  once at the beginning
– Listener  when the event occurs
– paint  when we need to refresh the screen
CS 122 – Jan. 23
Review mouse applets
Applet that draws polygon
“Containment hierarchy”
GUI Application
• Handouts:
– Draw polygon (PolygonApplet)
– Containment hierarchy
– EasyFrame
Drawing polygon
• GUI components:
– Frame contains panel, which contains: 2 Labels, 2 Text
fields, 1 Button
• Need to attach listener to button!
• Data structure for our polygon
– ArrayList of Point objects
– When ready to draw, create Polygon to paint in applet:
• X-array
• Y-array
• Number of points
We can create Polygon in the listener or in paint( )
• GUIs have components, and some components
generate events
– Components are placed into “panels”
• Ex. Polygon applet
– Components of the pop-up frame:
Frame has a Panel
Panel has 2 labels, 2 text fields, 1 button
– Attach listener to button, so we know how to add a new
– Draw polygon using x array, y array, # points
• Containment hierarchy for realistic GUI:
– Usually 1 frame (window), comes with its own default
panel: “content pane”
– Create more panels to add to content pane
– Put components into panels; panel has some layout that
determines where stuff goes
• What events could take place?
– ActionEvent for button
– ChangeEvent for slider
– CaretEvent for typing in text field (ignore this one)
Don’t memorize – these events are listed in handout
First examples
• “EasyFrame” has no events, but is application.
– Note use of JFrame. This is the starting point for any GUI
• Containment hierarchy can look like this
• Frame
– Panel
• North panel
– 2 labels
• West panel
– Panel for first name
» Label
» Text field
– Panel for last name
» Label
» Text field
CS 122 – Jan. 25
• Any questions on containment hierarchy?
– Note that we define frame in driver, and have separate
classes for panels
– In panel constructor, good place to add listener
• Handouts:
– Text area
– Tabs & layouts
– Rectangle frame (with control panel)
First examples
• “EasyFrame” has no events, but is application. 
– Note use of JFrame. This is the starting point for any GUI
• “Text area” example:
– Scroll pane
– Has an input frame, and an output frame
– Listener for button
Tabs & layouts
• There are different ways to lay out components in a
• In the “Tab” program, a frame contains a “tabbed
pane”, which contains 5 panels
• 4 panels show possible layouts
Flow (default)
Border (when you add stuff, say where it goes)
Box (uses “glue”; when initialized, specify vert or horiz
• With JLabel, you can center / justify text.
• We can draw geometric objects in panel instead of
– This program is similar to one in the lab
• Frame contains 2 panels
– Specified in frame’s constructor
– Input panel listens to Action event from a button
– Output panel repaints the rectangle
• paintComponent( ) is analogous to paint( ) that you
saw in applets.
CS 122 – Jan. 27
• More interactive features in a GUI!
– Radio buttons
• Button group to enforce mutual exclusion
– Check boxes
• Making a GUI easier to use
• Handouts:
– Pizza (choices)
– Day/night (special features)
Pizza GUI
• Allow user to make choices
– Radio buttons – mutually exclusive choice
– Check boxes – can have any # checked
– We’ll use the isSelected( ) method
• Any change will affect price, which is shown in a
JLabel object. Can use the same listener. 
• Button group for radio buttons
– to make sure only 1 can be selected
• Where / how do we update price?
• I decided not to have separate classes for each panel
Day / Night GUI
• User-friendly features in a GUI
– Enable / disable a button whenever appropriate
– Mnemonics
– Tool tips
• We have a picture panel and button panel
– We have listener for each of 2 buttons
– One panel needs to talk to the other  send parameter!
• Can you guess appearance / behavior from the code?
When are the various functions invoked?
CS 122 – Jan. 30
• Jlist
– Maintains a list of objects (e.g. file names)
Communication between panels
Review event train of thought
Border, split pane, scroll pane
– JList (pick image) and borders
– JSlider
– Events
Review events
Must implement
To get input, use in
mousePressed( )
getX( ), getY( )
[text field].getText( ) actionPerformed( )
button].isSelected( )
ListSelectionListener valueChanged( )
getSelectedValue( )
getValue( )
stateChanged( )
CS 122 – Feb. 1
• Recursion
– Definition
– How function calls work
– Purpose & examples
• Handouts:
– Odd number / factorial / Fibonacci
– Exponential and Pascal formulas
• When a function calls itself
• A more technical definition – within a single thread
of control, when a function is called and an earlier
instance of the same function has not yet returned
• It’s a problem-solving technique
– You have a problem but only know how to solve a small
– Break a problem down until it can be expressed in simple
Function calling
• Review what happens when main calls f, and f calls g.
– While in g, how do we know where we’ve been? How do
we know where to return to? Function calls to g may
appear many times throughout the program.
• In recursion, the common scenario is: main calls f,
and f calls itself!
– When f calls itself, it’s not yet returning
– How does process not go on forever?
– Need a “base case”; it must always be achievable.
• Direct vs. indirect recursion
• We’ll start with math formulas, as in defining a
sequence of numbers
• You’ll notice definitions with 2 parts:
– Base case
– Recursive case
• How would you define these sequences?
8, 13, 18, 23, 28, …
3, 6, 12, 24, …
• Well known recursive problems:
– Factorial, Fibonacci numbers, Pascal’s triangle
Thinking recursively
• Recursion is a different way of approaching the
problem of having to repeat something
– It’s accomplishing the same thing as a loop
– Theoretically, loops and recursion are equivalent, and we
can convert one to the other.
• Express some problem in simpler terms – an
important skill
– Ex. fact(6) = 6 * fact(5)
– How does this continue?
Formula examples
• Let’s examine recursive implementations of
– Computing the nth odd number
– Factorial
– Fibonacci
• Refer to handout
CS 122 – Feb. 3
• For HW: how to check cube color validity?
• Simple recursion examples
– Exponentiation
– Pascal’s triangle
• Number of recursive calls  inefficiency
• Towers of Hanoi
• Handouts:
– Towers of Hanoi
• Recursion can be inefficient when you run it
• For Fibonacci and Pascal’s triangle, the number of
function calls increases rapidly. Why?
• How many total function calls are needed for
– Pascal(a, b) calls Pascal(a – 1, b – 1) and Pascal(a – 1, b)
– We can draw a tree of all the function calls made.
– Implementations vary, but we can encounter base case
whenever b = 0 or a = b.
# calls for Pascal(a, b)
a = b or b = 0 1 call
1 + 1 + myself = 3
calls(2,0) + calls(2,1) + myself = 1+3+1 = 5
calls(3,0) + calls(3,1) + myself = 1+5+1 = 7
1 call
calls(2,1) + calls(2,2) + myself = 3+1+1 = 5
calls(3,1) + calls(3,2) + myself = 5+5+1 = 11
calls(4,1) + calls(4,2) + myself = 7+11+1 = 19
Towers of Hanoi
• How can we move n disks from peg A to peg B if:
– Can only move one disk at a time
– Cannot place a disk on top of a smaller one
– See handout
CS 122 – Feb. 6
• Any repetitive action can be expressed recursively
– In other words, we can convert from loop to recursion
– Some programming languages only allow recursion
• Examples
• Handout:
– Practice recursion
• Earlier we saw how to define a sequence recursively.
What if we interleave 2 sequences?
– 8, 12, 14, 20, 20, 28, 26, 36, …
• Beware of infinite recursion – stack overflow
– In my example “Limit” program, we could call ourselves
thousands of times before running out of memory for the
• For lab, see handout
• How would you approach the problem of
– Counting from 1 to 10? 10 to 1? Counting by 5’s?
Print the alphabet?
– Printing 10 stars? N stars?
– You always need a base case, and a recursive case (at least
1 of each)
• Converting an integer to/from binary string
CS 122 – Feb. 8
• More on number base conversion
• Another array application: search
– Linear search strategy
– Binary search strategy
• (Some searches require backtracking)
• Handouts:
– Recursive searches
– 8 queens
CS 122 – Feb. 10
• Recursion that takes care of mistakes, wrong turns
– Good for various types of searches
• Need to find solution to some problem without
trying every combination
• We’ll look at a couple of examples from chess
– 8 queens problem
– Knight’s tour
• Handout:
– Knight’s Tour
8 Queens
• Is there some way to place 8 queens on a chess
board so that no one can get captured?
– (There are no other pieces on the board, just the queens)
• There are over 4 billion ways to arrange the queens
• Don’t write 8 nested loops – waste
Outline of solution
• Place one queen in the first column.
• Place the next queen in the next column, in a place safe from
• If not able to place this queen, back up!
– It means previous combination of choices does not
produce a solution. We got stuck.
• In the code, look for
– Pre and post condition (comment for us)
– Base case = we’re done
– When we backtrack, see value of done
• To find all solutions, set done to false to pretend we failed.
Print solution in base case.
Knight’s tour
• Is it possible for a knight to make legal moves from square to
square on a (5x5) chess board visiting all the squares exactly
(8x8 board would be too slow)
• Code is a little ugly, due to knight moves and checking if off
the board
• Details
– We start in a corner
– There are 8 possible directions to choose from
– Cells are numbered 1-25
• Actually, an elegant iterative solution exists:
– Start in a corner & continue to rotate in same direction
using outermost possible squares.
CS 122 – Feb. 13
• Backtracking
– We saw some chess examples 
– How would we find all solutions to a problem, not just
• Something amazing: finding our way through a maze
• Handouts:
– Maze
• Famous example of backtracking
• solve( ) is heart of solution – returns boolean
• Start at solve(0, 0)
– Done = false
– If blocked, return false
// base cases
– If at exit, return true
Else // recursive case:
– Mark this square as being tried
– Done = solve (go up)
– if not done, try another direction
If done, mark the successful path  cascade returns
• blocked( ) checks
– Wall
– Out of bounds
– Already tried that cell
• Where do we see backtracking
– In the code?
– In the output?
• Similar problems: 3-D, finding size of contiguous
area or tumor
CS 122 – Feb. 15
• Example: Upside-down numbers
• Defining things with recursion
– Specifically, sets of strings
– Recursive function to test whether a given string has this
– Generating these strings. (How do we decide on length?)
• Handout
– Defining strings
Recursive definitions
• Recursion is good for defining things. If we can
define a set of strings, then we can
– Generate possible strings in that set
– Test for membership – like a compiler checking syntax
• We start with a description of a set of strings
– Ex. Strings consisting of 1 or more a’s.
• Can be helpful to write down example words
• Identify base case, recursive rule
• Write a formal definition (can express it in Java)
– See handout for examples
Accepting a string
• Starts with an ‘a’, followed by zero or more ‘b’s
Function isGood(String s)
if s is “a”, return true
else if s length <= 1, but not “a”, return false
else // length 2 or more
verify that s ends with ‘b’
call isGood(s minus its last character)
• String is of the form anbn
Function isGood(String s)
if s is “”, return true
else if s length is odd, return false
else // even length
verify s begins with ‘a’ and ends with ‘b’
call isGood(s minus first and last char’s)
CS 122 – Feb. 17
• Recursion with sets of strings:
– Accepting a string
– Generating possible strings: how do we decide on the
length?  concept of random recursion
• Handouts
– Accept
– Generate
– Lobster
Random recursion
• Let the machine decide how far to go
• Often, this means we are counting up, rather than
– Ex. To return a random factorial, we effectively multiply 1
* 2 * 3, … and stop at some random place
• Technique is useful when you want to generate
random text
– How many words do you want?
– Inside recursive function, we “decide” whether to take
base case or not.
CS 122 – Feb. 20
• Using recursion to …
– Generate all possible permutations
– Evaluate a numerical expression
• Handouts
– Review for midterm, including old test w/answers
– Permute
– Evaluator
• How do we permute 4 objects { W, X, Y, Z } ?
Write down W, then permute the 3 objects { X, Y, Z }
Write down X, then permute the 3 objects { W, Y, Z }
Write down Y, then permute the 3 objects { W, X, Z }
Write down Z, then permute the 3 objects { W, X, Y }
• See what is happening in general?
– Each object gets swapped to the front temporarily
– Rest of list is permuted recursively
• What is the base case?
Expression evaluation
What is a mathematical expression?
Expr = “terms” separated by + or –
Term = “factors” separated by * or /
Factor = single number or another expression!
• Indirect recursion
• For simplicity, implementation assumes 1 digit
• Program works also for non-nested expressions:
start with these
CS 122 – Feb. 22
• Recap grammar in Old Mother Hubbard
• More about expressions
– Define recursively in form of a grammar
– Introduce alternate notations: prefix and postfix
• Recursively defined image
• Handouts:
– GUI recursion (Fractal)
Making couplets rhyme
• She went to the ____ to buy him a { ____ [
But when she came back, he was ____ the ] ____ }
• The 2nd and 4th blanks must rhyme. Select these
words at the same time!
• She went to the ____ to buy him a (ee….ee)
• Between the rhyming words, we have a phrase that
can be chosen independently.
Expression grammar
• “An expression is 1+ terms separated by + or –”
• “A term is 1+ factors separated by * or /”
• “A factor is a single number or an expression”
expr  term | term + term | term – term
term  factor | factor * factor | factor / factor
factor  number | expr
number  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
• In later courses, we’ll look at this in more detail to
work out subtleties.
• There are 3 ways to write down a mathematical
expression. We’ll study them in detail soon.
• Infix (default): operator sits between operands
• Prefix: operator appears before operands
• Postfix: operator appears after operands
• Advantage of prefix & postfix
– No ambiguity about order of operations
– So, we never need grouping symbols
• Can we define prefix and postfix expressions
GUI recursion
• Drawing tiled images
– To make this work, the applet has to know where images
– drawImage() lets us display image without JLabel. It has
several versions based on parameters.
• Fractal image. Organization:
• Applet contains panel which contains:
– Button panel
• 2 buttons
• 2 labels
– Koch snowflake panel
Keeps track of level. For each level, 5 points on the side of a triangle.
CS 122 – Feb. 27
• Design
• Software life cycle
• Classes
– discovery
– Class relationships
• State
• Handouts:
– OO design concepts
– Store design example
• Software life cycle… includes maintaining code
• Don’t try to do everything at once
– Waterfall model: risk of over-designing, creating classes
you’ll never use
– Spiral approach: go thru iterations, phases
• Creating a design
– Discover classes & their relationships
– Consider sequence of events  “states”, although this is
not needed for simple programs
• In real life, classes can be complex, with many
attributes and operations.
• For example: Airline, Army, Bank, Casino, Hotel,
Restaurant, Store
• Programs including such a class may need other
classes as well, such as Employee, Customer, Product
• The state you are in determines the effect /
functionality of input devices (buttons, dials).
– Consider accelerator pedal.
• Class relationships (database) for a Store
• What states would we encounter in…
a microwave oven?
a vending machine?
an ATM?
a calculator?
a casino?
buying an airline ticket?
• A GUI has buttons that need to respond based on
state; multi-purpose buttons.
CS 122 – Feb. 29
• Lab recap (designs)
• ATM design example
• Handouts
Answers to test
Complete list of lab questions
ATM design
ATM button & state table
Other examples
• In any design, we need to consider
– What buttons do we need?
– What events do we pay attention to?
– How do events and states relate?
• For next day, please be ready to discuss:
– Design a child’s game that will display a clock and ask the
user to enter the time. Determine if the time entered is
correct. Keep track of score. Different players may play.
– Design a similar game that tests arithmetic
– Design a vending machine
CS 122 – March 2
• Design examples
– ATM 
– Game
– Vending machine
• What does it mean for a program to be in some
– What states are in a solitaire game?
• Handout:
– Game design
Review design
• For ATM, buttons A, B, C have different purpose
depending on state – that’s why they have generic
– Otherwise we’d need a lot more buttons!
• Take note of
– Attributes and operations of each class
– Table of states and buttons
• For vending machine,
– What buttons do we need?
– What events do we pay attention to?
– How do events and states relate?
Vending machine design
• Machine contains products
• Product has price, quantity – can be replenished
• Need to accept coins and user selection as “input event” that
may signal a state change.
• Machine class:
Product class:
– Attributes are:
product [ ]
password / key
– Operations are:
emptyMoney( )
create all products
Attributes are:
Operations are:
buy( )
replenish( )
Need coins
& selection
Coin return
Operator key
Grab selection from
text box
If money >= price
--# product
Refund if excess
Grab password
If correct,
state = operator
State = coins
Grab choice
Go to appropriate
Incr # product
State = operator
Empty me
totalMoney = 0
State = operator
CS 122 – March 12
• Sorting
– Many possible ways to accomplish the task
• Brings up concept of algorithm efficiency
• Handout:
– Selection and insertion sort
• Some methods do better depending on type of data
or how distributed
• Easy algorithms are not the fastest
–  Most efficient algorithms are clever
• One possible method, “swap” sort:
for i = 0 to n – 1,
for j = i+1 to n – 1,
if a[ i ] and a[ j ] are out of order,
swap them
Famous methods
• Selection sort: Find the largest value and swap it into
first, find 2nd largest value and put it 2nd, etc.
• Bubble sort: Scan the list and see which consecutive
values are out of order & swap them
• Insertion sort: Place the next element in the correct
place by shifting other ones over to make room
• Swap sort: For each pair of elements, see if they are
in correct sequence. If not, swap them.
Check out
• Enjoy this Web site demo of sorting methods
• Doing operations in parallel can speed up sorting
– In bubble sort, we can swap #1 and #2 at the same time as
#3 and #4, etc.
CS 122 – March 14
Recursive sorts
• Merge sort: Split list in half until just 1-2 elements.
Merge adjacent lists by collating them.
• Quick sort: We want “smaller” elements on left, and
“larger” elements on right. Recursively continue on
these 2 subarrays.
– Can be tricky to code
• Handouts:
– Code for recursive sorts
– Quick sort
Quick sort
• Reminiscent of merge sort, but the tough part is how
we partition the (sub)-array
• Given a slice of the array a[ p .. r ],
– Need to decide on suitable pivot/threshold value.
– Start with left and right hands at both ends
– Work inward. Stop when you see
• Big number (> pivot) on the left
• Small number (< pivot) on the right, and
– Swap numbers if necessary
– The loop stops when the hands converge. This marks end
of the first partition for the recursive calls.
CS 122 – March 16
• Sorting using the Java API
• Radix sort
• Bucket sort
• Handout:
– Sorting using the API
Sorting with API
• If a is an array of primitive values, use
• If a is an array of objects, use
Arrays.sort(a, comparatorObject)
• If a is an ArrayList, use
• Purpose of comparator is to tell Java how to compare
2 values belonging to some class. We need to create
our own comparator class first!
• Arrays and Collections also have other helpful
methods for max, min, search, shuffle….
Special sorts
• If we make some assumptions about the input data,
it is possible to sort in linear time. In other words, by
only using singly nested loops.
• Radix sort
– Dates from the time of punch cards
– Assume an upper bound on # digits of each number
– For each decimal place from right to left:
• Create 10 buckets labelled 0-9
• Place each number into bucket based on its digit @ this
decimal place
• Concatenate the buckets
Bucket sort
• Works well if values evenly distributed between
lowest and highest
• Partition the range of values into n buckets
– Typically n = size of array
– For example, if you want 10 buckets, and the range of all
values is 0-200, then the first bucket holds values 0-20, the
next bucket holds values 20-40, etc.
• For each value in array, place into appropriate bucket
• Within each bucket, sort elements using any method.
Ideally there would be very few elements per bucket.
• Concatenate the buckets.
CS 122 – March 19
• Finish sorting
– Counting sort
– Stooge sort
• Collections (data structures)
– Aggregate data types other than array
– Defining our own data types that can hold many objects
– Example: “Bag” data structure
• Handout:
– Halloween (bag with ArrayList)
• Counting sort
– Good when you have small number of distinct values or a
small range  repetition
– Use 2nd array to record distribution of values
– Use distribution to tell you how many of each value to put
into final array
– Example
A = [ 3, 1, 2, 1, 2, 0, 3, 0, 1, 3, 2, 3 ]
C = [ 2, 3, 3, 4 ]
B = [ 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3 ]
• Stooge sort
– Needlessly complex, approx. on order of n2.7
– Purpose purely to study complexity of algorithms,
originally a textbook exercise
– Given a (sub) array a[p..r],
First, swap endpoints a[p] and a[r] if they are out of order
if p..r consists of 3+ elements,
recursively call yourself 3 times: first 2/3 of list, last 2/3
of list, first 2/3 of list again
• Very often we want to maintain a lot of data of the
same type
• Many ways to conceive of a collection of data
• May be linear or nonlinear
• A general procedure:
– Write an interface – what ops should it do?
– Choose an underlying representation. Then you are ready
to implement the interface
– Test the implementation. If not satisfied, you may need to
change the rep’n
• Common operations to a collection include:
– Insert, delete, search/get, sort, shuffle, how many, print
whole thing out
• Design questions for the rep’n
Where do elements get added? Does it matter?
Can I have duplicate values?
Are the values always sorted?
Among (add, search, remove), which operation will be
performed most frequently?
– Answers to these questions will help decide on rep’n
• Once you have a collection implemented, may be
helpful to have systematic way to visit all elements.
• Iterator = another class whose sole purpose is to
traverse your collection
– A collection may have > 1 type of iterator.
• Iterator is an interface in the Java API. Methods:
– next( )
– hasNext( )
– remove( ) -- we rarely need this one
• Here’s one way to include an iterator with your
• Somewhere you need to define the iterator class
– Implement the 3 required methods
– Since we are not interested in removing elements, just
throw UnsupportedMethodException
• In your collection’s implementation, include a
method that will return an iterator object.
• Whoever has an iterator object (e.g. Driver class) can
call next( ) and hasNext( ) as needed.
Example: Bag
• Let’s create our own collection: the Bag
• Purpose – to simulate contents of a bag/sack/box –
we are generally uninterested in object’s location
inside the Bag.
– Drawing numbers for bingo or similar game.
• Bag interface should include what operations?
– In particular, what operation(s) might a Bag have that
other collections would not? And vice versa?
CS 122 – March 21
• Data often collected into some kind of collection or
data structure 
– May be linear or nonlinear, as appropriate
• Another linear data structure: linked list
• Handouts:
– Linked list guide
– Linked list implementation
Linked List
• Linear data structure, with individual elements called
• Unlike an ArrayList, the individual elements are not
– There is no direct way to obtain, say, the 10th element.
• Very little information is needed to maintain list:
– List itself consists of a head and tail. Head refers to first
element in the list. Tail = last.
– Each node in the list knows its previous and next element
• So, instead of an array(list) of data, we have a linked
list of node objects, each of which holds some data
LL Implementation
• First, create Node class
– 3 attributes: data, prev, next
– The data can be of any type. For this reason, we typically
define Node generically.
– The prev and next are also Node objects.
• LinkedList class
– Attributes: head, tail (optional: # of elements)
– Operations: insert & remove are most important!
– What else?
– (Almost every class should have toString( ), at least for
LL handout
• Examine the various operations such as remove( )
and sort( )
• We can test the LL implementation with Driver and
Tester classes.
– Prints node address so we can see the actual “links”
• We have a separate Node class
Notes on LL
• Be sure to work out insert/delete operations. There
are special cases to consider (see handout).
• Singly linked versus doubly linked
• Compared to ArrayList, which LL operations are
slower, faster?
• The Java API already has a LinkedList class. Creating
our own is a good exercise to learn how to
implement collections, but in practice not necessary.
CS 122 – March 23
• 2 more linear data structures:
– Stack 
– Queue (we’ll study this a little later)
• Handouts:
Intro to Stacks and Queues
Stack implementations
Application: Balanced parentheses
Application: Postfix evaluation
• A linear data structure where we insert & remove
elements from the same end, e.g. the “top”.
“Last in – first out” philosophy.
The insert operation is called push
Delete operation is called pop
For example, what is left in a stack after performing these
push(3); push(7); push(4); pop( ); push(1); pop( ); pop( );
• Stacks are good when you want things to be
processed backwards & forwards
– Ex. Maze; certain card games; Recursion & backtracking
• Stack Interface
– Useful operations: push, pop, peek, size, toString( )
• (Several implementations possible based on rep’n)
• Representation: ArrayList of any type of object.
– In other words, we can define a generic stack to make the
implementation as general (useful) as possible
• Boundary cases
– Need to handle empty stack situation. When is this a
– Could also impose a logical limit on size. How? Why?
• API has a Stack implementation
• Some applications we will examine:
– Checking balanced parentheses in a mathematical
– Evaluating a postfix expression
– Converting an infix to a postfix expression
• You will see stacks in later courses
– When a computer program runs, the system must
maintain a run-time stack to keep track of nested function
calls. When a function returns, how does the system
remember where to return back to?
Balanced ( )
Push every (
Pop a ( on every )
Ignore all other characters
Result: When do we say input is OK?
– Stack should be empty when done with input.
– Bad if we have extra ( at the end, or no ( to pop when we
see a ).
– Need to guard against a run-time error (empty stack).
Evaluate postfix
• Every operator needs 2 operands (the most recent 2)
• So, push numbers on the stack
• When you see an operator:
Pop a number, and call it y
Pop another number, and call it x
Evaluate x OP y
(don’t get this backwards!)
Push the result
• When done, pop the answer.
Our postfix expression is: 25 5 – 6 7 + * 9 +
When you see a number… push.
When you see an operator… pop 2, evaluate, push.
When no more input, pop answer.
20 260 260 269
CS 122 – March 26
• Math expressions
– Evaluating postfix
– Evaluating prefix
– Converting from infix to postfix
• Handout:
– Infix to postfix worksheet
– Infix to postfix ( incomplete for lab)
Evaluating expressions
• How can a computer evaluate a numerical
• First, need to introduce some notation:
– Infix: operator goes between numbers
– Prefix: operator goes before numbers
– Postfix: operator goes after numbers
• It turns out that evaluating postfix expressions is
relatively quick & easy.
• So, our approach is a 2-step process:
– Convert an infix expression into postfix
– Evaluate the postfix expression
Eval prefix ?
• Incidentally, how would you evaluate a prefix
expression? Not as easy. Try some examples:
• +23
Push +, push 2 …
• +–234
Push +, push -, push 2, …
• +23–4
Push +, push 2, …
Infix  Postfix
• Observations:
– The order of the numbers inside the expression does not
– Operators appear later in the postfix version, and possibly
out of their original order
• Let’s look at simple examples first
– How does 1 + 2 become 1 2 + ?
– How does 1 + 2 – 3 become 1 2 + 3 – ?
– Compare what happens to 1 + 2 * 3 and 1 * 2 + 3. What is
different, and why?
• 1 + 2 * 3 becomes 1 2 3 * +:
push +, push *, and when done pop *, pop +
• 1 * 2 + 3 becomes 1 2 * 3 +:
push *, pop * & push +, when done pop +
• 1 * 2 / 3 + 4 – 5 becomes 1 2 3 * / 4 + 5 –:
push *, pop * push /, pop / push +, pop + push -,
at end pop • 1 + 2 * 3 – 4 becomes 1 2 3 * + 4 –:
push +, push *, pop * pop + push -, at end pop -
More examples
• ( 1 + 2 ) * 3 becomes 1 2 + 3 *
What happens when we get to the ) ?
• 2 – (3 – (4 – 5 * 6) + 7) * 8 becomes
push -, push (, push -, push (, push -, push *,
at ) pop *-(, at + pop – push +, at ) pop +(, …
• Contrast these similar cases:
1 – 2 – 3 becomes 1 2 – 3 –
1 – (2 – 3) becomes 1 2 3 – –
Moral of story
• Don’t worry about the operands. Just print them out
as you see them.
• Always push (
• When you see ), pop until you see matching (
• When you see an operator, pop those of same or
higher precedence. Stop popping when you get to (
or an operator of lower precedence. Then push me.
CS 122 – March 28
• Collections
– array, ArrayList, LinkedList, Stack, Queue, …
• Issues
– How do you want to use your data?
– How should they be (logically) arranged? In a linear fashion?
• Let’s look at queues and hashing
• Handout:
– Queue
– Hashing (to see how fast searches are)
• Similar to stack, except:
– Philosophy is “first in, first out” not “last in, first out”
– As a result, we add elements to one end of the queue, and
remove from the opposite end
– Names of operations: enquque and dequeue
• Purpose
– Collect data before it can be processed
– Data needs to be buffered for some time (Has this
happened to you?)
– Producer / consumer, Reader / writer, input / output,
client / server
• Can be tricky to implement with array
• (Java API has no Queue class)
Set interface
• Often we want to store values in a collection, and our
situation calls for:
– No duplicates
– Order of elements doesn’t matter (freedom to place
elements arbitrarily)
• Desired operations:
– Add, remove, contains, iterator
• Java API has 2 Set implementations that are more
efficient than ArrayList or Linked List.
– HashSet and TreeSet
•  Important to understand concepts of hashing &
• Hash table = A fancy array-type of data structure
• Purpose
– We want quick search; respectable insert, delete times
– May want to access any element any time, so we don’t
want the restrictions of stack or queue
– There is no notion of a previous or next value
– We have no need to sort the data
– Our “bag” data structure could be a hash table if we desire
• Major feature: each element has a hash value or hash “code”
that determines where into the hash table the object belongs.
How to hash
• Where does an object go into a hash table?
• The object needs to have some intrinsic hash value.
Can be determined by an instance method
hashCode( ).
– Example of hash code could be some unique attribute
value, such as ID number.
• Next, we use a hash function to convert a hash value
into the index into the hash table.
– Why? Because hash values can be very large.
– Use a mod function based on the size of the hash table.
In other words
• Hash values are usually not very interesting.
• hashCode( ) functions just have to make sure that
the objects have a good chance of having unique
hash values.
• The interesting part of hashing is figuring out where
data winds up in the hash table – this is determined
by the hash function.
• Hash table size is usually a prime number.
– We would like objects to map to different places in the
hash tables. Otherwise we have a collision.
CS 122 – March 29
Hash code 
Hash function 
Collisions, and what to do about them
Java’s “Hashtable”
– An array-type data structure allowing you to index on any
value, not just a number.
– Hashing implementation done behind the scenes.
– Useful methods: put, get, containsKey, keys
• Handout: Hashtable of foods
• Let’s suppose our hash table has size 11.
• A simple hash function would say
index = hashCode % 11
• What happens if we insert the values 23 and 34?
23 % 11 = 1, and 34%11 = 1.
Both values map to the [ 1 ] entry of the table.
• Ex. What objects can live at position [ 0 ] ?
• There are 2 ways to resolve this situation.
– Chaining: allow multiple values to reside inside each cell
of the hash table.
– Probing: On a collision, find a different nearby place to sit.
• Basically it means that each cell of the hash table
maintains a linked list of objects. Each object has a
hash code that maps to this address.
• The Java API HashSet class uses this technique.
• Also known as open addressing
• If cell c is already occupied, find a nearby alternative.
• Linear probing
– Try c+1, c+2, c+3, … until you find a space
– At end of the table, wrap around to [ 0 ] and continue
– Try example with size 11 table: 24,45,3,87,65,32,1,12,17,
using h(k) = k mod 11.
• Quadratic probing
– Try c+1, c+4, c+9, … instead
• Example: Apply hash function h(k)=(2k+1) mod 7 for
k = 10, 15, 5, 8, 3, 20, 6. What happens?
Load factor
• Tells you how much of the hash table is occupied
• In practice, when load factor exceeds some
threshold, we enlarge hash table and rehash.
• In Java’s HashSet, the initial size is 16, and the
threshold load factor is 0.75 before doubling size.
CS 122 – April 2
• The last major subject of course is nonlinear data
– Graphs (e.g. network)
– Trees (e.g. hierarchy)
• Consider graphs first, since trees are a special case
– To model something that “looks like” a graph
– Relationships between objects we can quantify/measure
• Graph terminology
• Creating our own Graph class:
– Representation, desired ops, applications
• A nonlinear data structure
• Useful to model any kind of network, or set of relationships
• Questions we may want to ask:
– How many vertices / edges are there?
– Does an edge exist from x to y?
– How far apart are x and y?
– How many edges incident on x? (i.e. find the degree)
– How many nodes are within some distance from x?
– Is y reachable from x?
– Is there a systematic way to visit every node and return
back to the beginning?
Graph ideas
• A graph consists of
– A set of nodes (i.e. vertices)
– A set of edges
• The purpose of an edge is to “connect” two vertices:
to make them adjacent to each other.
• Sometimes…
– The graph may be weighted: We may want to label a cost
or distance on each edge.
– The graph may be directed: This indicates one-way traffic.
• Adjacency list
– For each vertex in the graph, we maintain a list (e.g. linked
list or array list) of other vertices that are directly
connected to this one
• Adjacency matrix 
– A 2-d array
– The vertices are in some order, such as alphabetical order
(A, B, C, …)
– The entry in row / column indicates whether there is an
edge or not. (1 or 0)
– Elegantly handles weighted and directed graphs.
What can we
determine about the
graph, when given its
adjacency matrix?
Adjacency matrix
• For an ordinary graph
– The matrix is symmetric. For example, if A is adjacent to C,
then C is adjacent to A.
– No vertex is adjacent to itself. So, the main diagonal is all
• For a directed graph
– Matrix not likely to be symmetric
• For a weighted graph
– We still put 0’s along the main diagonal, to indicate zero
cost or distance.
– To represent non-adjacent, use ∞ value.
Possible design
• Graph Attributes
– # vertices, # edges
– List of vertices
– Adjacency matrix
• Graph Operations
Create empty graph
Add vertex; add edge
edgeExists(v1, v2)
Neighbor iterator(v)
BFS & DFS iterators (v)
Is connected, etc.
• Design of Vertex depends
on graph’s application…
• Vertex attributes
Others depending on app.
• Vertex operations
– Mark / unmark / is marked
– Get and set level
– equals
Some details
• What do entries in adjacency matrix mean?
not adjacent (be careful)
• addVertex( ) – need to re-allocate array space
– This is the price we pay for using array not ArrayList
• addEdge( ) – the no parameter version uses default 1
• findVertex(String) – used by iterator, edgeExists( ),
addEdge( )
• neighborIterator( ): just look at 1 row of adj matrix!
CS 122 – April 4
• Graph implementation
– Relies on Vertex class
– Array representation of Vertices and edges
– Desired operations?
• Neighbor iterator
• Handout:
– Graph
• Attributes: Vertex [ ], int [ ] [ ]
• When creating a graph we may do operations in this
– Create empty graph (in default constructor)
– Add in all the vertices, then the edges
• Desired operations?
Add vertex; add edge
Return # vertices; # edges
See if some edge exists (or return the distance label)
Find degree of a vertex
Neighbor iterator
• Besides basic graph operatons, we may have specific
questions, depending on why we are using a graph
• Finding a path (sequence of vertices) between 2
– In particular, find the shortest path
• Finding a cycle
• Seeing if the graph is connected or not
• Let us visit all the vertices 
• Here is one way to implement an iterator
• Create an inner class, containing
Attributes needed during the traversal
Constructor initializing attributes
hasNext( )
next( )
remove( ) – honestly, we won’t need this so you can leave
its implementation empty
• Create a method that returns iterator object
• In driver/test class, use hasNext/next as needed
Neighbor iterator
• A systematic way to traverse all neighbors of a vertex
• Read across one row of the adjacency matrix
– Row number will remain constant
– Column number will change
• Constructor: initialize column number to first
column representing a neighbor
• hasNext: have we reached the end of adj mat?
• Next: return corrent column #, and advance it to
next neighbor’s column
• (Alternative approach: create ArrayList in
CS 122 – April 11
• Systematic ways to visit all vertices in a (connected)
– Breadth-first search (BFS)
– Depth-first search (DFS)
• Handouts:
– Worksheet for BFS, DFS
– BFS / DFS code
Breadth-first search
• We assign a level to each vertex.
• Start with some vertex, and assign it level 0.
• Fan out in all directions at once (well, in some order).
– All the neighbors of the first vertex become the “level 1”
– Neighbors of the level 1 vertices become “level 2”
– In general, at level n, all the neighbors that do not already
have a level number become “level n+1”
– Continue until all vertices assigned.
Depth-first search
• Start at some vertex.
• Continue down a path as far as you can until you
reach a dead end.
• Then you backtrack to some point where you had a
choice / fork in the road, and you now try a different
• For this to work, there needs to be a way to select a
“next” vertex to go to – we use the neighbor iterator
(e.g. alphabetical order).
Using BFS / DFS
• Iterators for BFS or DFS can be used to answer
questions about a graph. Let’s try these…
• Finding path between 2 vertices
– DFS: handle it like a maze
– BFS: may be easier, and can find all paths simultaneously
• How to find a cycle
– Which iterator should we use? The Vertex “mark”
attribute may be useful here.
• Is everybody connected?
CS 122 – April 13
• Important graph application:
Dijkstra’s shortest path algorithm
• Handout:
– Dijkstra worksheet (2 pages: instructions plus examples)
Dijkstra’s algorithm
• How do you find the shortest path in a network?
• General case solved by Edsger Dijkstra, 1959
• Let’s say we want to go from “A” to “Z”.
• The idea is to label each vertex with a
number – its best known distance from A.
As we work, we may find a cheaper
distance, until we “mark” or finalize the
1. Label A with 0, and mark A.
2. Label A’s neighbors with their distances
from A.
3. Find the lowest unmarked vertex and
mark it. Let’s call this vertex “B”.
4. Recalculate distances for B’s neighbors via
B. Some of these neighbors may now
have a shorter known distance.
5. Repeat steps 3 and 4 until you mark Z.
First, we label A with 0. Mark A as final.
The neighbors of A are B and C. Label B = 4
and C = 7.
Now, the unmarked vertices are B=4 and C=7.
The lowest of these is B.
Mark B, and recalculate B’s neighbors via B.
The neighbors of B are C and Z.
– If we go to C via B, the total distance is
4+2 = 6. This is better than the old
distance of 7. So re-label C = 6.
– If we go to Z via B, the total distance is
4 + 3 = 7.
Now, the unmarked vertices are C=6 and Z=7.
The lowest of these is C.
Mark C, and recalculate C’s neighbors via B.
The only unmarked neighbor of C is Z.
– If we go to Z via C, the total distance is
6+4 = 10. This is worse than the
current distance to Z, so Z’s label is
The only unmarked vertex now is Z, so we
mark it and we are done. Its label is the
shortest distance from A.
Postscript. I want to clarify something…
The idea is to label each vertex with a number
– its best known distance from A. As we
work, we may find a cheaper distance,
until we “mark” or finalize the vertex.
When you mark a vertex and look to
recalculate distances to its neighbors:
– We don’t need to recalculate distance
for a vertex if marked. So, only
consider unmarked neighbors.
– We only update a vertex’s distance if it
is an improvement: if it’s shorter than
what we previously had.
Dijkstra path
• The answer is not just a number
• In implementation, need some way to store the
actual path taken.
• Possibilities:
– When Dijkstra’s algorithm is done, work backwards from
the destination vertex to deduce the path. Look at the
vertex labels of Z and its neighbors to see which neighbor
logically comes before Z. For example, if label (Y) + edge
YZ = label(Z).
– During Dijkstra, when labeling a vertex, also store the list
of vertices encountered along the way from A.
CS 122 – April 16
• Tree: a special kind of graph
– Good for storing information hierarchically
– Often we want (rooted) binary trees
• Traversing a tree
– More interesting options besides BFS and DFS!
• Handouts:
– Tree with preorder (simple implementation)
– Tree traversals (worksheet)
• Defined to be a connected acyclic graph
– All vertices are connected to all the others (note that this
does not mean adjacent to all the others).  not
disconnected, no isolated vertices 
– No cycle exists anywhere in the graph
• In CS most trees we use are binary trees
– There is a root vertex at the top
– Each node has up to 2 children.
– Specialized terminology:
parent, left & right children, sibling, ancestor, descendant
• How would we design a Node class for trees?
Trees vs. graph
• A tree is a special kind of graph
• Trees may be:
– Rooted
• Among the rooted trees, one important kind is the
binary tree! In it, each node has left & right subtree.
– Non-rooted. Also called “free trees”
– (We could make use of inheritance if we wanted all these
possibilities implemented.)
• Binary tree class design:
– Attributes for a root and number of nodes
– Constructor(s), size( ), toString( ), preorderIterator( )
Building a binary tree
• Suppose we want to create a mathematical
• Simplest tree is a root with 2 children.
– Ex. 1 + 2: the root is “+”, 1 is left child, 2 is right child
• We don’t add nodes / edges one at a time!
• Instead, combine 2 subtrees with a common parent.
– Ex. For (1 + 2) * (3 + 4), we create little trees for 1 + 2, and
3 + 4, and then combine them with * on top.
• In this application, not interested in removing
Binary tree traversals
• Important operation is to visit all vertices  iterator
– General procedures like BFS and DFS do not take the
hierarchical nature of the data into account. They “skip
around” too much.
• 3 ways to traverse a binary tree, each done recursively.
– Preorder = process this node, and then call myself with
each child
– Inorder = call myself with my left child, process this node,
then call myself with my right child
– Postorder = call myself with each child, then process
– If “process” means print, traversals give rise to 3
exprression notations: prefix, infix and postfix.
• Is it possible to determine the tree, given a traversal?
• If we’re given an expression in prefix, infix, or postfix
notation, we can build the tree.
• Some notes:
– Prefix and postfix notations never need parentheses
– In prefix, an operator is applied to the 2 “numbers” (atoms
or subexpressions) that follow
– Similarly, in postfix, an operator is applied to the 2
numbers that precede it.
• A little more difficult for general binary tree.
Tree & traversal
• Given a (binary) tree, we can find its traversals. √
• How about the other way?
– Mathematical expression had enough context information that 1
traversal would be enough.
– But in general, we need 2 traversals, one of them being inorder.
• Example: Draw the binary tree having these traversals.
– Hint: End of the postorder is the root of the tree. Find where the root
lies in the inorder. This will show you the 2 subtrees. Continue with
each subtree, finding its root and subtrees, etc.
• Another example...
CS 122 – April 18
• Review tree traversals
• Binary Search Trees
• Handout
– 5 ways to arrange 3 nodes
Tree & traversal
• Given a (binary) tree, we can find its traversals. √
• How about the other way?
– Mathematical expression had enough context information that 1
traversal would be enough.
– But in general, we need 2 traversals, one of them being inorder.
• Example: Draw the binary tree having these traversals.
– Hint: End of the postorder is the root of the tree. Find where the root
lies in the inorder. This will show you the 2 subtrees. Continue with
each subtree, finding its root and subtrees, etc.
• Another example...
Binary search tree
• Special kind of tree, an excellent data structure
– Efficient searches, insertions, deletions
– Operations should take logarithmic time 
• In the Java API, the TreeSet class is a kind of binary
search tree called a red-black tree.
– BST with extra functionality to make sure the tree stays
“balanced” as we insert/delete elements
– In ordinary BST, it’s possible for it to degenerate into a
linked list if we’re unlucky
• Binary search idea may already be familiar to you:
guessing games, spell checker.
BST Motivation
• Store data so it will be easy to find later.
– Does this list contain 52?
5,8,12, 13, 15, 18, 20, 25, 30, 32, 36, 38, 44, 45, 58, 61, 62,
77, 80
• Key property of BST: each node’s value is:
– For all nodes: left child  you  right child
– Or better yet: all in L subtree  you  all elements in R
• Then, it’s easy to tell where to search for something,
or where a new element must go.
BST operations
insert (x)
delete (x)
find (x)
predecessor (x)
successor (x)
And we can generalize, in case x is not inside:
– closestBefore (x)
– closestAfter (x)
Search, min, max
• Search(k)
– Just follow binary search strategy.
– Starting at root, follow left or right child… until you find k
or discover it can’t exist.
– May be recursive or just a while loop.
• Next, how would you find …?
– Min key value in the tree
– Max key value in the tree
CS 122 – April 20
• BST operations
– find, insert 
– predecessor, successor,
– delete
• Efficiency of BST
– Tradeoff ?
– balancing
• Pred & succ are analogous/symmetric cases, so let’s just look
at successor function. Try example.
if right(x) not empty
return min(right(x))
// succ is lowest ancestor of x whose
// left child is also ancestor of x
y = parent(x)
while y && x == right(y)
x = y
The while loop goes up
y = parent(x)
the tree until x is the left
return y
child of its parent
Insert, delete
• Insert(k) is like search 
– The new key becomes a left/right child as appropriate.
– Special case if tree initially empty.
• Delete(victim) – Careful… there are 3 cases, depending on
how many children the victim has.
– If victim has no children: it’s a leaf.
Just tell parent to set its left or right child pointer to null.
– If victim has 1 child:
Reset the victim’s parent and child to point to each other now.
– What if victim has 2 children?
Delete, continued
• Deleting a node that has 2 children:
– Replace the victim with succ(victim)
– Now, need to repair pointers where succ(victim) came from.
Turns out that succ(victim) has at most 1 child.
It can’t have a left child… if it did, then it’s not really the succ.
Go back to how we calculated succ(x) to see why it’s true.
So, treat the removal of succ(victim) like the 0 or 1 child delete case.
• Example: insert 10, 5, 16, 3, 7, 12; then delete 3, 16, 10.
• Comparing a TreeSet with an ArrayList, how long
does it take to insert 1,000,000 elements, and then
perform a futile search?
• TreeSet: 10.504s to insert; 0.001s to search
• ArrayList: 1.891s to insert; 0.161s to search
• What is the tradeoff?
• BST operations should take O(log2 n) time.
• But in the worst case, a BST acts like a linked list.
– Example: insert elements in sorted order
• Goal: as we insert elements, the BST should
maintain balance
• How to define/quantify balance?
– We can measure the height of each node
– For each node, the height of its 2 children do not differ by
more than 1.
• What if unbalanced? Rotate.
CS 122 – April 23
• Analyzing binary trees
– Some definitions
– Quantitative relationship between height of tree
and # of nodes. (balance)
Binary trees
• Each node has  2 children.
• Very useful for CS applications
• Special cases
– Full binary tree = each node has 0 or 2 children. Suitable
for arithmetic expressions. Also called “proper” binary
– Complete binary tree = taking “full” one step further: All
leaves have the same depth from the root. As a
consequence, all other nodes have 2 children.
Binary tree properties
• Suppose we have a full binary tree
• n = total number of nodes, h = height of tree
• Let’s establish lower and upper bounds on…
The number of leaves
The number of internal nodes
The total number of nodes
The height (The answer to this one will show us why it’s important for
a tree to maintain balance!)
• Let’s repeat the above assuming it’s a full binary tree.

similar documents