Report

Implementing an ADT • Having the JCFs is great – most of the significant data structures are already available • But you still may need to implement your own – the JCF implementation may not contain the methods you want – the JCF implementation may not be as efficient as what you can create – the JCF may not restrict access like you would like • for instance, Queue is an interface, the implementations of Queue are LinkedList and PriorityQueue, neither of which presents a real Queue – or you are working in a language other than Java? • As a CS student, you need to not only learn the JCFs but how to implement your own ADT – here, we focus on the List as an ADT More • It is important to understand how to use the JCFs because using them will save you work, why reinvent them? – It is also important to understand how to implement ADTs on your own for the reasons stated on the previous slide but is also furthers your understanding of OOP, implementing proper interfaces (not interface classes, but the interface of a class with a user class) and will also let you think about computational and space efficiency of a data structure • There are usually 2 implementation ways to implement a given ADT, an array and a linked list – The array is already known by you – The linked list means that you create a node for each datum to be stored in the container and you link the nodes together What is a Collection? • Generically, a collection is just a group of objects that share the same type, each object is known as an element of the collection – In Java, the Collection is an Interface with three subtypes (Set, List, Queue) each of which is an Interface – Some of these collection types do not allow duplicate elements – Some collections allow you (or force you) to access elements based on a key value (e.g., access by ID number) – Some collections restrict access • • • • Location to insert Location to remove Inability to search the collection for a particular entry Inability to traverse the collection (without removing elements) • We generally think of a Collection as a list of items – the question is how we access the list Ordered List ADT • The List JCF classes allow inserting anywhere • Instead, we want an Ordered List ADT – insert only in the appropriate place based on ordering • You can remove any element as long as its removal maintains the ordering of the list – removal will be done by item, not index • Other methods are – traversal of the list (e.g., print all elements) – search for a particular element in the list – returning the element itself – destroy the list – obtain the list’s size – determine if the list is empty – determine if the list is full (for the array version) • As stated earlier, we can implement this using an array or a linked list, we will examine both Unordered List ADT • Another way to represent a List is without ordering – Why? It is easy (less computationally expensive) to add elements to an unordered list – Also, some lists do not necessarily require ordering (consider a shopping list, you usually do not alphabetize the items) • The problem with the unordered list is that search is expensive – We will see why later • The unordered list ADT has the same methods as the ordered list but may be implemented differently: – insert (always insert at the beginning or end for convenience), search, remove, traverse, destroy, size, isEmpty, isFull Array Implementation of Unordered List • Although we might want to use Generics to implement our lists, we cannot if we are implementing an array – for simplicity, we will assume an array of int values • We use the following three instance data – int[] list; – int size; – int number; // initialized to new int[100] // set to 100 // initialized to 0 • And the following methods • public boolean isFull( ) { return size==number;} • public boolean isEmpty( ) {return number==0;} • public void traverse( ) { for(int i=0;i<number;i++) System.out.println(list[i]); } • public void destroy( ) {number=0;size=0;list=null;} Insert • Since this is an unordered list, we can insert the new value anywhere • Where can we insert it that requires the least amount of work? • If we insert it anywhere from 0 to number-1 (that is, in a position already occupied), we have to move elements around • Instead, we will always insert at the end of the array at location number, and then increment number public void insert(int x) { if(number<size) { list[number]=x; If the array is full, we can output an error number++; message here, throw an exception (we would } want to add throws ExceptionType to the else … header), or do array doubling } Search and Delete • These methods will unfortunately require that we find the item through a sequential search • We will implement search first and use it for delete public int search(int x) { int location=-1; // indicates not found for(int i=0;i<number&&location==-1;i++) if(list[i]==x) location=I; return location; } public void delete(int x) We delete by replacing the item at index { with the last item in the array (at number-1) int index=search(x); and then decrement number if(index!=-1) { list[index]=list[number-1]; number--; } If not found, output error message or throw else … Exception } The Cost Of Sequential Search • As the item we are searching for can be anywhere in the array (or not in the array at all), we have to look at every value until we find it – We will search from the beginning onward (or we could search from the end inward) – Once we find the item, we stop searching – In the worst case, the item we are looking for is literally the last item in the list based on the direction of our search, or is not in the array at all – If we have an array of n items • in the best case, it takes 1 search • in the worst case, it takes n searches • in the average case, it takes (n+1)/2 searches, or roughly n/2 searches – Keep this in mind for later… Array Implementation of Ordered List • We will use the same instance data as with our Unordered List and the same initial methods – int[] list; – int size; – int number; // initialized to new int[100]; // set by constructor to 100 // initialized to 0 • public boolean isFull( ) { return size==number;} • public boolean isEmpty( ) {return number==0;} • public void traverse( ) { for(int i=0;i<number;i++) System.out.println(list[i]); } • public void destroy( ) {number=0;size=0;list=null;} public void insert(int x) { boolean done=false; for(int i=number;i>0&&!done;i--) if(x<list[i-1]) list[i]=list[i-1]; else { We insert x into its proper location by list[i]=x; finding the first element > x and move that number++; and all other elements down by 1 done=true; } if(!done) { Notice that we actually start from the right list[0]=x; end of the array and shift while we search number++; backward for the proper location } } B efo re inserting e at insertio n p o in t i 0 1 e0 e1 e A fter in serting e at insertio n p o in t i, list size is incre m e nted b y 1 … … i i+ 1 … k-1 e i-1 e i e i+ 1 … e k-1 e k 1 e0 e1 … … i e i-1 e d ata.length -1 … sh ift… Insertio n p o int 0 k i+ 1 i+ 2 … k ei … e k-1 e k e inserted here e i+ 1 k+ 1 d ata.length -1 Searching the Ordered List public int bsearch(int item) { int high=number-1,low=0,mid=(high+low)/2; while(low<=high&&list[mid]!=item) { We use binary search if(list[mid]>item) high=mid-1; else if(list[mid]<item) low=mid+1; for searching which is much more efficient mid=(high+low)/2; } than sequential search if(list[mid]==item) return mid; else return -9999; // use this to indicate item not found } Why binary search? We could do sequential search which means on average we have to look at n/2 items to find what we are looking for Binary search will require looking at no more than log n items to find the item we are looking for (or determine it isn’t in the list) Is the difference between n/2 and log n significant? Depends on n If n = 100, then it is 50 versus 7, not a big difference, but if n = 1,000,000 then the difference is 500,000 versus 20! public void delete(int x) { int temp=bsearch(x); if(temp==-9999) System.out.println("Item not found!"); else { for(int i=temp;i<number-;i++) list[i]=list[i+1]; number--; } } B efo re d eleting the ele m e nt at ind e x i 0 1 e0 e1 … … 0 1 e0 e1 … … We call upon binary search to find the location of the item to be deleted so that we don’t have to repeat the code here Deletion requires shifting to “cover up” the deleted item i i+ 1 … k-1 e i-1 e i e i+ 1 … e k-1 e k D elete this ele m e nt A fter d eletin g the ele m e nt, list size is d ecrem e nted b y 1 Deletion i e i-1 e i+ 1 … sh ift… … … k d ata.length -1 k-2 k-1 e k-1 e k d ata.length -1 Using Search for Delete • Notice in both of our implementations (ordered and unordered array list) we used the search method to obtain the index of an item we want to delete • Why should we use search for delete? – We don’t have to, but it saves us from having to re-implement the same code in another method • Could we also use the search method to identify the location to insert into (for the ordered list)? – We wouldn’t need to do this for the unordered list because we are always adding at the end • Notice in the ordered insert, we are searching from the end of the array forward until we find the proper insertion location – While we are searching, we are also shifting elements up the array to create a vacant spot – If we use the binary search to find a location for insert, we would still have to do all that shifting, so we may as well combine Conclusions on Array Implementations • We are familiar with arrays so it shouldn’t be challenging to implement • The unordered list’s insert and delete are easy but the ordered insert and delete, requiring shifting, is harder • Array doubling is necessary if we don’t want to arbitrarily limit the size of the collection – Array doubling is computationally expensive if you have to do it often – Without array doubling, we either have to resort to using a very large collection with the potential to waste space or limit the user program to a size that may be too small • As we will see, an array for an unordered list is about the same computationally as an unordered linked list • The savings in using an array over a linked list appears when we implement an ordered list Linked Lists • A linked list consists of nodes, one node per datum – Unlike the array where there may be many empty elements • A linked list node will be defined as some (1 or more) data member(s) and a next member which will store the next node in the list – The next field will be a reference variable (pointer) – The data member(s) can be Generic, unlike our array implementation • The linked list starts with a single front pointer (called head below) – We may or may not have a tail pointer • The linked list will require that we create a new Node to insert into the list and attach the Node via references head N ode 1 N ode 2 ele m e nt ele m e nt ne xt ne xt N ode n … ele m e nt null tail The Node • Before we define the Linked List more formally, let’s look at the Node – The Node will be defined as its own class (whether a standalone or nested class) • Any Node needs at least two instance data – The value that the Node is storing • a primitive datum like an int, an object like a String, an object that itself contains multiple values like a Student object, multiple data like a name, a major, a GPA, or a Generic type of Object – A reference to the next Node in the list • we will usually refer to this member as next and it will be of the same type as the Node itself, e.g., Node next; • this member is a pointer to the next Node but we don’t call them pointers in Java, so it is a reference to the next Node Why a Linked List • We will find that dealing with reference variables to insert and delete items from a list can be a challenge, so why bother? – We want an array of Integers, we declare Integer[ ] foo; – Later we do foo=new Integer[100]; • what happens if we need 101 Integers? • what happens if we only use 5 Integers? – The array stores a preset number of elements while the linked list contains 1 node for each element in the list so the linked list is more memory efficient (lower space complexity) – But the linked list suffers in search efficiency (time complexity) especially when we have a large list of items that are sorted • Another advantage of the Linked List is that the Node can be defined as Generic while an array cannot (although we could use an ArrayList to implement a Generic OrderedList class) Implementation Details of the LL • First we implement the Node – This will either be a separate class or an inner class – We will need to implement getters and setters for the data and the next members – We need to get and set next when we are dealing with getting the next node pointer or changing pointers • Note the use of <E> to make this generic – We must include in our outer class <E extends Comparable<E>> public class Node<E> { private E datum; private Node<E> next; public Node(E x, Node n) {datum=x;next=n;} public Node getNext() {return next;} public E getDatum() {return datum;} public void setNext(Node<E> n) {next=n;} public void setDatum(E x) {datum=x;} } Implementation Details of the LL • To create a linked list, set front to null (also tail to null if we have tail) – If we are keeping track of the size, set size to 0 – A list is empty if size is 0 (or front==null), the list is never full • To search an existing LL for some item temp, start at front and move through the list until we either find temp or reach null – Remember that E is Comparable public Node<E> search(E x) { Node<E> temp=front; while(temp!=null&&temp.getDatum().compareTo(x)!=0) temp=temp.getNext(); return temp; Once found, we can either } return the Node (search) or the index (search2) public int search2(E x) { int index=0; Node<E> temp=front; while(temp!=null&&temp.getDatum().compareTo(x)!=0) { temp=temp->next; index++; } if(temp==null) return -1; else return index; } Implementation Details of the LL public void insertFront(E x) { Node<E> temp=new Node(x,front); front=temp; h ead } Insert at the front of the list – we use this approach for our unordered list ADT only Traverse/print the list e0 tail … n ext A n ew n od e to b e in serted h ere ei e i+ 1 n ext n ext … ek n u ll elem en t n ext (a) B efore a n ew n od e is in serted . h ead tail elem en t e0 n ext n ext … ei e i+ 1 n ext n ext N ew n od e in serted h ere (b ) A fter a n ew n od e is in serted . public void traverse() { Node<E> temp=front; while(temp!=null) { System.out.println(temp.getDatum()); temp=temp.getNext(); } } … ek n u ll Implementation Details of the LL • Inserting in order requires finding the proper location to place the new Node • 3 cases – Empty list – create a new node and set front to it – Insert at the beginning – see previous slide – Anywhere else – search the list for the first Node whose datum > new value • For the third case, let’s assume we use the following code, is it correct? Node temp2=front; While(temp2!=null&&temp2.getDatum().compareTo(temp)<0) temp2=temp2.getNext(); temp.setNext(temp2); // but how do we attach temp to the list? Let’s Take a Closer Look • temp2 is pointing to the first Node whose datum > temp’s datum, so we know to insert temp before temp2 – We set temp’s next field to point at temp2, but we do not have a reference to temp2’s preceding node, how do we make the Node storing 63 point at temp? • We use 2 pointers to traverse the list, we will call them, current and previous – before moving current onto the next node, we set previous to be current – now when we reach 70, current is pointing at 70 but previous is pointing at 63, so we can modify 63’s next pointer to point at temp Implementing Ordered Insert public void insert(E x) { Node<E> temp=new Node<>(x,null); if(front==null) front=temp; else if(front.getDatum().compareTo(x)>0) { temp.setNext(front); Case 1: empty list front=temp; } Case 2: insert at beginning else { Node<E> current=front, previous=null; while(current!=null&¤t.getDatum().compareTo(x)<0) { previous=current; current=current.getNext(); } Case 3: search for proper previous.setNext(temp); location to insert temp temp.setNext(current); } } previous.setNext(temp); temp.setNext(current); What if current is null? This occurs if the place to insert is after the last Node in the list – this is not a special case, the same two instructions work Implementing Remove (Delete) • We have several versions of a deletion – Delete first element • if(front!=null) {front=front.getNext( );size--;} • else … // output a message, throw an exception – Delete by index for(current=front,i=0;current!=null&&i<n-1;i++) current=current.getNext( ); if(current!=null) current.setNext(current.getNext( )); – Delete a specific element, x • covered on the next slide Implementing Remove (Delete) • Again, we have three cases – If empty list, throw Exception or output error message – If item to be deleted is the first item in the list, redirect front to point at the second item – Otherwise, search the list using two pointers (we need to have a pointer to the previous Node) public void delete(E x) { if(front==null) // throw Exception or output message else if(front.getDatum().compareTo(x)==0) front=front.getNext(); // the first Node is no longer else { // pointed at, it will be Node<E> current=front; // garbage collected Node<E> previous=null; while(current!=null&¤t.getDatum().compareTo(x)!=0) previous=current; current=current.getNext(); } if(current!=null) previous.setNext(current.getNext()); } } { Comments • Notice the last if statement, we use this because if the item isn’t found, current will be null and we can’t perform a deletion – in the case of item not found or empty list, if we throw an Exception, we have to add throws Exception or use a trycatch block to handle the Exception • We can improve the efficiency by using current.getDatum().compareTo(x)<0 – how does this help? – if the item isn’t found, we stop searching as soon as we find the first datum greater than x – imagine we have the list 10, 20, 30, 40, 50, 60, 70, 80 and are looking to delete 15, we can discover 15 is not in the list as soon as we hit the Node with 20, but using the code on the previous slide will have us continue searching through the entire list Why Linked List Revisited • We cannot use binary search on a linked list – To find some datum in an ordered array takes O(log n) operations (worst case) while in the linked list, O(n) – We already saw that this was a significant difference when we compared the ordered and unordered lists in an array • To insert and delete in an ordered array, we have to shift elements up or down, as many as n elements • In the linked list, we have to find the proper location to insert or delete, as many as n elements – So we see that for an ordered list, the array and linked list versions give us the same worst case performances • The linked list gives us superior memory performance over the array because with the linked list, we do not need to use array doubling, nor do we waste memory space Which Should We Use? • If we know something about the use of the data, it might help us decide – How often are we inserting/deleting versus searching? – If searches are a minimum, the linked list might be better • consider a list where we spend much time adding and removing items, and traversing the entire list, but we rarely actually have to search for a single element in the list, this might be the case for instance for a teacher maintaining a list of students – On the other hand, what if we need to create a list and once created, rarely change it but spend a lot of time searching for particular elements? • You will learn alternate forms of lists in 364, notably trees and graphs – A tree typically is implemented using a linked structure and can be more efficient than either the ordered list as an array or linked list Linked List Variations • Dummy node – Create a linked list with a special dummy node (thus the list is never empty) where the dummy node is always first and contains a dummy value which will always be less than anything in the list – We will never insert before the first node so there are no special cases • Circular linked list – Only need one pointer, tail – To get to head, use tail.getNext( ) • Doubly linked list – Each node has two pointers, a previous and a next pointer – This requires modifying not only the linked list class but also the Node class – We no longer need a previous and a current pointer to work through the list, just current (we can get to previous by current.getPrevious( )) but it does require manipulating more pointers upon insert and delete Variations Illustrated head head N ode 1 N ode 2 ele m e nt ele m e nt ne xt ne xt ne xt N ode 1 N ode 2 N ode n ele m e nt ele m e nt ne xt ne xt null null pre vio us pre vio us N ode n … … ele m e nt ele m e nt tail tail Comparing Ordered Linked List vs Ordered Array Methods MyArrayList/ArrayList MyLinkedList/LinkedList add(index: int, e: E) O (1) O (n ) O (1) O (n ) clear() O (1) O (1) contains(e: E) O (n ) get(index: int) O (n ) O(log n) O(1) indexOf(e: E) O (n ) O (n ) isEmpty() O (1) O (1) lastIndexOf(e: E) O (n ) O (n ) remove(e: E) O (n ) O (n ) size() O (1) O (1) remove(index: int) O(n) O (n ) set(index: int, e: E) O(n) O (n ) addFirst(e: E) O(n) O (1) removeFirst() O(n) O (1) add(e: E) O(log n) O (n )