## CS 34 Applied Algorithms: Syllabus

Jan 25. Introduction. The gap between theory and practice.

**For Thurs:**Surf the web to see what you can learn about the following problems: Traveling Salesman Problem; Graph Coloring; Single-Pair Shortest Paths; Reconstructing Phylogenies. Save the links.**Assignment**for next week: Think about implementing an exhaustive search method for Graph Coloring, using the framework provided in class. We will discuss this on Thursday.

Feb 1. **Algorithm Engineering and Algorithm Design**. Tuesday we will survey algorithm engineering, algorithm tuning, and code tuning ideas. Thursday we will apply these ideas to the exhaustive search algorithm(s); and survey parsing and randomization tricks.

- For Tues
**:**JLB, Faster and faster yet. - For Tues: JLB on performance tuning
- Bader et al. on Engineering Algorithms for Optimal Phylogenies
- Skim: Schirazi on Java Performance Tuning
**Assignment**(start this week): Add branch-and-bound, algorithm tuning, and code tuning ideas to your exhaustive search implementation. Keep separate copies of original and tuned implementations.

**Feb 8. Tuning Algorithms + Analyzing efficiency ( What to measure)**. On Tuesday we will finish our discussion of algorithm tuning. On Thursday we will consider ways to measure algorithm and program efficiency.

- For Tues: No readings. We will finish our discussion of algorithm tuning.
- For Thursday: CCM on What to Measure.

Feb 8. **Analyzing program performance. **Tuesday lab work on Unix tools and tips on measuring algorithm performance. Thursday lab work on R, a data analysis package. The lab meets in 007 Seeley Mudd building (basement level).

- Tuesday Lab work using Unix timing tools and shell scripts, discussion of scaffolding.
- Thursday Lab work using R for data analysis.
**Assignment**(due March 8): Apply branch-and-bound, algorithm tuning, and code tuning, and memory-based tuning to your exhaustive search algorithm for graph coloring. Measure the results. Write a paper describing your results.

Feb 22. **What Statisticians Know**. Basics of experimental design and data analysis.

- For Tuesday: CCM on Experimental Design
- For Thursday: CCM et al. on finding asymptotic functions in data.

Mar 1.** Discussion of the class projects**: Schedule school bus routes for the Amherst Regional School System, and Schedule rooms (and courses) at Amherst College.

- Tuesday: A visitor from the Registrar's office to discuss course scheduling.
- Thursday: A visitor from IT to discuss online mapping tools.
**Assignment**: Due**April 29**; an implementation and (written) progress report are due**April 12**. Implement a solution to one of the application problems. Evaluate it on a well-chosen testbed of instances. Write a paper describing your results.

Mar 8. **Coping with NP-Hard problems**. We will discuss constructive and approximation algorithms for TSP and Graph Coloring, and decide on a handful of promising candidates for our applications. **Assignment: **Come to class prepared to explain at least one (two if you can) promising TSP and one GC algorithm. It should be a* constructive* algorithm. What are its merits (theoretically and/or experimentally), and why do you think it could work for our problems? Here are some resources to try.

You should understand the algorithm well enough to write pseudocode on the board, and explain why it would be a good possibility.

- Surf: Joseph Culberson's Graph Coloring Resources Page.
- Surf: Skiena's Stony Brook Algorithm Repository.
- Surf: DIMACS Implementation Challenges. The Second Challenge (1993) was on Graph Coloring (and two related problems). The Eighth Challenge (2001) was on TSP.
- Skim:
, by Applegate and Cook (in AC Science Library).*The Traveling Salesman Problem: A Computational Study* - Skim:
E.L. Lawler, editor (in AC Science Library).*The Traveling Salesman Problem: A Guided Tour of Combinatorial Optimization*, - Skim:
*The Traveling Salesman Problem and its Variations*, by Gutin and Punnen. Available electronically from Frost Library. **First Paper is DUE.****Tuesday**: We discuss algorithms and decide who's doing what.**Thursday**: A lecture on parsing.

Mar 15. Spring Break.

Mar 22. ** Heuristic search - Local Search**: Hill Climbing, Tabu Search, Simulated Annealing. On Tuesday we will discuss these heuristic approaches to problem solving. Assignment for Thursday: Read one research paper from the list (as assigned). Also read** Hooker** on experimental methodology. Be prepared to report on: (1) the problem analyzed in this paper; (2) how the heuristic is applied to the problem (what neighborhood, objecctive function, etc); (3) what other algorithms are being compared; (4) how well the heuristic performs; (5) do Hooker's criticisms apply to this research (evaluate the quality of the work).

**Tuesday**: Glover, ``Tabu Search Part I,''*ORSA J. Computing*, Vol 1, No 3, Summer 1989.**Tuesday**: Rutenbar, ``Simulated Annealing Algorithms: An Overview,'' IEEE Circuits and Devices, January 1989 pp20-28.**Thursday**: J. Hooker, ``Testing Heuristics: We have it all wrong,''*Journal of Heuristics*, 1995, vol 1, pp 33-42.Thursday

**Dan**: Laarhoven, Aarts, Lenstra, ``Job shop scheduling by Simulated Annealing'''Thursday

**John**: Anagstopolos, and van Hentenryck, ``A Simulated Annealing approach to traveling tournament problem,''Thursday

**Pau**l: Gendreau, Hertz, Laport, ``A Tabu Search Heuristic for the Vehicle Routing Problem,''Thursday

**Andrey**: Dowsland, Soubeiga, Burke, ``A simulated annealing based hyperheuristic for determining shipper sizes for storage and transportation,'' European Journal of Operational Research, 179 (2007) pp 759-774.Thursday

**Kush**: Grabowski and Pempera, ``The permutation flow shop problem with blocking. A Tabu search approach,''*Omega*35 (2007) pp 302--311.

Mar 29. **Other Strategies for Optimization Problems**: Tuesday surveys: Hybrid metaheuristics, evolutionary algorithms, Linear Programming and Integer Programming approaches. Thursday: examples with experimental work (the Lagrangian Relaxation paper is a survey/tutorial).

Tuesday: Raidl, ``A unified view on hybrid metaheuristics,''

*Hybrid Metaheuristics, Springer LNCS*, Vol 4030/2006.

- Tuesday: A Tutorial on Integer Programming (online). The OR faculty of CMU Graduate School in Industrial Administration, Summer 1997. Read these sections: Introduction, Modelling with Integer Variables. Skim: Solving Integer Programs.
Thursday 1: Goncalves, Mendes, Resende, ``A hybrid genetic algorithm for the job shop scheduling [roblem,''

*European Journal of Computing.*Thursday 2: Roth and Yih, ``A linear programming formulation for global inference in neural language tasks,''

*Proceedings of the 8th Conf. on Natural Language Learning (CoNLL-04*), May 2004.Thursday 3: Mendez-Diaz Zabala, ``A cutting plane algorithm for graph coloring,''

*Discrete Applied Mathematics*, 156 (2008), pp 159-179. Ok to skip Section 3 of this paper.Thursday 4: Mark Reiman, ``Guiding ACO by problem relaxation: a case study on the symmetric TSP,''

*Hybrid Metaheuristics*, LNCS 4771, pp 45-55, 2007. (handout in class)Thursday 5: Fischer, Merz, ``A memetic algorithm for the optimum communication spanning tree problem,''

*Hybrid Metaheuristics*, LNCS 4771, pp 170-184, 2007.

Apr 5. **Constraint satisfaction problems + Progress Reports. **

Tuesday: Vipin Kumar, ``Algorithms for constraint-satisfaction problems: a survey,'' AI Magazine, Vol 13, No 1, 1992.

Tuesday: Malik and Zhang, ``Boolean Satisfiability: from theoretical hardness to practical success,''

*CACM*August 2009.Tuesday: Lynce and Ouaknine, ``Sudoku as a SAT Problem,''

*Proceedings of 9th AIMATH,*2006.Thursday: Progress reports from students. Goal: you should have input/output routines and a hopefully a skeleton solver up and running.

Apr 12. **Memory-efficient algorithmics**. Cache-efficient and I/O efficient strategies.

Tues

**Paul**: LaMarca and Ladner, ``The influence of caches on the performance of heaps'', JEA 1997. ( Ok to skim the middle part with theorems. )Tues

**John**: Ladner, Fortna, Nguyen, ``A comparison of cache-aware and cache-oblivious static search trees using program instrumentation.,'' Fleischer et al (eds),*Experimental Algorithmics,*LNCS 2545, 2002. pp 78-92.Tues

**Andrey**: Lam, Rothberg, Wolf, ``The cache performance and optimizations of blocked algorithms,'', April 1991. Proceedings, 4th ASPOLOSTues

**Kush**: Brodal, Fagerverg, Vinther, ``Engineering a cache-oblivious sorting algorithm, '' JEA V12 No 2.2, June 2008.Tues

**Dan:**Frias, Petit, Roura, ``Lists revisited: cache conscious STL lists,''*Experimental Algorithms:*WEA 2006, Carme Alvarez and Maria Sernas, eds. LNCS 4007.Thurs: For good survey and background information, see Vitter, ``External memory algorithms and data structures: dealing with massive data, '' ACM Computing Surveys 2001.

Thurs

**Kush**: Ajwani, Meyer, Osipov``Improved external memory BFS implementations,'' ALENEX 2007.Thurs

**Dan**: Rahn, Sanders, Singler, ``Scalable distributed-memory external sorting, '' CoRR 2009.Thurs

**John**: Yannis and Zobel, ``Compression techniques for fast external sorting,'' VLDB Journal V16 No2, 2007. pp 269-291.Thurs

**Paul**: Arge, Toma, Vitter, ``I/O efficient algorithms for problems on grid-based terrains,'' ALENEX 2000.Thurs

**Andrey**: Meyer, Osipov, ``Design and implementation of a practical I/O-efficient shortest paths algorithm,'' ALENEX 2009.

Apr 19. ** I/O - Efficient Algorithms and Massive Data Sets**: Tuesday: student reports on last week's I/O-efficient algorithms. Thursday: problems on streaming data, everybody skims the three papers below.

Tuesday: I/O efficient algorithms (from last week).

- Thursday: Andrey presents Meyer,Osipov.
Thursday: Dean, Ghemawat, ``MapReduce: Simplified Data Processing on Large Clusters. ''

Thursday: Tsourakakis, et al. ``Doulion: Counting triangles in massive graphs with a coin,' '

*ACM KDD*June/July 2009.Thursday: Kumar et al, ``Data Streaming Algorithms for Efficient and Accurate Estimation of Flow Size Distribution,''

*SIGMETRICS/Performance '04*, 2004.

Apr 26. ** Bioinformatics: Sequence alignment and phylogeny reconstruction. **

- Tuesday: Lecture on the terminology and problems of sequence alignment and phylogeny reconstruction.
- Th: Kuhner and Felsenstein, `
**`**Simulation comparison of phylogeny algorithms under equal and unequal evolutionary rates,'' Mol. Bio. Evol. 11(3) pp 439-468. - Th: Leaver-Fay, et al., ``Faster placement of hydrogens in protien structures by dynamic programming,'' JEA V12 No 2.4 June 2008.
- Th: Swenson et al, ``Approximating the true evolutionary distance between two genomes,'' JEA V12 N 3.5, August 2008.
- Th: Wan et al, ``Phylogenetic methods for the analysis of genome rearrangmenet data: an empirical study.''
- Th: Doring et al, ``SeqAn: An efficient generic C++ library for sequence analysis,''
*BMC Bioinformatics*, 2008, 9:11. - Th: Sundquist et al, ``Whole-Genome sequncing and assembly with high-throughput, short-read technologies,
*PLoS ONE*www.plosone.org, May 30, 2007

May 3. Final Project reports.

**Papers due; reports in class.**