University of Ulster, Magee College. BSc. Applied Computing (1220), Year 4. Course: Image Processing (AC460). Autumn Term 1996. Lecturer: F.D. Murtagh. 1. Introduction. ---------------- 1.1 Motivation and Rationale. ---------------------------- Digital image processing and digital signal processing -- the course covers both -- are amongst the fastest growing computer technologies of the 1990s. With increasing computing power, it is increasingly possible to do numerically many tasks that were previously done using analogue techniques. More importantly, it is now feasible to perform processing on signals and images that were previously unthinkable. There are related advances in performance and cost of sensors: a $30,000, three stone TV camera of 20 years ago, now retails at $300 and fits in the palm of your hand; likewise the capacity of communications channels and storage devices. Digital television will arrive in 1997 or so; we routinely transfer images over Internet / World-Wide-Web. It is now as valid to think of an image as data for processing just as a column of numbers in a bank balance, or strings of text in a database. Digital image processing is now state-of-the-art in many areas of high-technology: industrial inspection, monitoring of the earth and weather forecasting, document handling; and in areas of low(ish)-technology: personal computers and multimedia, digital cameras and electronic darkrooms. Although some image processing requires special architectures (eg. the real-time processing of TV images - 30 per second), much practical work can be done on a general purpose computer - even a modest PC. Thus, it is as valid to think of a picture as data for processing as a column of numbers in a bank balance, or strings of text in a database. There is an active Signal and Image Processing Research Group at Magee College -- around 10 researchers. We offer PhD programs in the field. The principal objective of this unit is to lay a foundation for further study and research in this field. 1.2 Objectives: --------------- On completion of this unit the student should be able to: - understand the fundamentals of digital image representation and the elements of a digital image processing system. - be familiar with a simple model of visual perception. - describe and apply image enhancement techniques. - understand the fundamental concepts of one-dimensional signal processing, and its link with image processing. - understand the concepts and applications of image transforms. - describe and apply image data compression techniques. - appreciate the concept of data compression applied to text, signals, and images. - describe and apply image segmentation techniques. - understand the fundamentals of pattern recognition. - be aware of current advances in, and applications of, image processing. 1.3 Syllabus ------------- A. Introduction and Fundamentals. 15% --------------------------------- Scenes, images, digital images. Concepts of image processing. Relationship with signal processing. Sample applications. Visual perception. An image model. Sampling and quantisation. Imaging geometry. Colour. B. Mathematical Preliminaries. 5% ------------------------------ Linear algebra, vectors and matrices. Complex numbers. Probability, random variables. C. Image Enhancement. 15% --------------------- Concept of image enhancement. Spatial operations, point operations. Noise and degradation. Point operations. Grey scale transformations. Lookup table. Colour coding. Thresholding. Histogram modification. Global enhancement, local enhancement. Averaging of multiple images. Spatial operations. Smoothing, sharpening. Smoothing. Averaging. Lowpass filtering. Median filtering. Other non-linear filtering. Sharpening, edge enhancement. Gradients and differentiation. Common gradient masks. Highpass filtering. Marr-Hildreth theory. Edge detection. Image restoration. D. One-Dimensional Signal Processing. 10% ------------------------------------- Signals, waveforms, digital signals. Sampling and quantisation. Frequency, phase. Periodic signals. Fourier series. Orthogonal functions. Finite Fourier series. The Fourier transform. Discrete Fourier transform and FFT. One-dimensional convolution, correlation. Filtering. Spectrum analysis, frequency discrimination. Other linear transforms, e.g. Hadamard, Cosine. E. Transforms in Image Processing. 10% --------------------------------- Two-dimensional discrete Fourier transform. Separable transforms. Two-dimensional convolution, correlation. Other image transforms. Applications, e.g. fast correlation, convolution, data compression, filtering, image restoration. F. Segmentation. 10% ---------------- Concepts: single pixel classification, boundary based methods, region based methods. Single pixel classification. Thresholding. Local, global. Adaptation to multispectral images, clustering. Boundary based methods, edge detection. Region growing methods. Aggregation, splitting and merging. G. Image Data Compression. 13% -------------------------- Concepts and context. Information and information transmission. Entropy, redundancy, correlation. Sub-divisions: error free, controlled degradation. Major techniques: transform, predictive, source coding, quantisation, ad hoc structural, image model coding, knowledge based. Current developments: JPEG, MPEG. H. Pattern Recognition. 15% ----------------------- Concepts: features, invariant features, similarity measures, classifiers. Feature extraction. Classifiers and classifier training. Classification rules: geometric, statistical, template matching and correlation. Applications, e.g. shape recognition, spectral pattern recognition. Current developments, e.g. neural nets, fuzzy systems. Alternative methods: knowledge based, structural. I. Advances in Image Handling and Processing. 7% --------------------------------------------- Still images. Video images. Machine vision. Document image processing. Industrial inspection. 1.4 Recommended Texts and Indicative Reading. -------------------------------------------- If the module is taught by J.G. Campbell, you will get a complete set of notes written by him - these form the 'essential' reading for the course; they will give a comprehensive coverage; however, as this is a final year course, you will need to do background reading to supplement. Of the books listed below, Niblack is by far the easiest introduction to the subject - you could teach yourself from this book; Gonzalez and Woods is probably the closest treatment to my notes (the original version of my notes were based partly on Gonzalez and Wintz 2nd Ed.). Bassman and Besslich is down to earth and practical. However, Rosenfeld and Kak is still the best overall treatment of image processing - the most complete and correct; this book comes in two volumes and is expensive - about 75 pounds for the two volumes. Castleman (1996) is very good. Pratt (Digital Image Processing) is recent and very complete but difficult and mathematical in places; also expensive. As I said, Bassman and Besslich is very good and practical -- plenty of software examples -- and comes with a disk, also Pitas and Parker for software. Van der Heijden has some good practical stuff. Key: OWL - one-week loan ***new for 1998 @Book{cox-flbi, author = "E. Cox", title = "Fuzzy Logic for Business and Industry", publisher = "Charles River Media", year = "1995", OPTeditor = "", OPTvolume = "", OPTedition = "", OPTannote = "" } @Book{jang, author = "J-S.R. Jang and C-T. Sun and E. Mizutani", title = "Neuro-fuzzy and Soft Computing", publisher = "Prentice Hall", year = "1997", OPTeditor = "", OPTvolume = "", OPTedition = "", OPTannote = "" } @Book{davies-mv, author = "E.R. Davies", title = "Machine Vision: Theory, Algorithms, Practicalities", publisher = "Academic Press", year = "1997", OPTeditor = "", OPTvolume = "", OPTedition = "", OPTannote = "" } @Book{campbell-irs, author = "James B. Campbell", title = "Introduction to Remote Sensing", publisher = "Guilford Press", year = "1997", OPTeditor = "", OPTvolume = "", OPTedition = "2nd", OPTannote = "" } @Book{cressie, author = "N.A. Cressie", title = "Statistics for Spatial Data", publisher = "Wiley-Interscience", year = "1993", OPTeditor = "", OPTvolume = "", OPTedition = "Revised", OPTannote = "" } @Book{devlin-goodbye, author = "K.J. Devlin", title = "Goodbye, Descartes: The End of Logic and the Search for a New Cosmology of the Mind", publisher = "John Wiley", year = "1997", OPTeditor = "", OPTvolume = "", OPTedition = "", OPTannote = "" } @Book{lyon-rao, author = "D.A. Lyon and H.V. Rao", title = "Java Digital Signal Processing", publisher = "M&T Books", year = "1998", OPTedition = "", OPTannote = "" } @Book{devlin-patterns, author = "K.J. Devlin", title = "Mathematics: the science of patterns", publisher = "Scientific American", year = "1997", OPTedition = "", OPTannote = "" } @Book{harvey-forecasting, author = "A.C. Harvey", title = "Forecasting, structural time series models and the Kalman filter", publisher = "Cambridge University Press", year = "1990", OPTeditor = "", OPTvolume = "", OPTedition = "", OPTannote = "" } @Book{luger-stubblefield-3e, author = "G.F. Luger and W.A. Stubblefield", title = "Artificial Intelligence 3rd ed.", publisher = "Addison Wesley", year = "1997", OPTedition = "", OPTannote = "" } @Book{nilsson-new, author = "N.J. Nilsson", title = "Artificial Intelligence: a new synthesis", publisher = "Morgan Kaufmann", year = "1998", OPTedition = "", OPTannote = "" } @Book{mcclellan-etal, author = "J.H. McClellan et al", title = "Computer Based Exercises for Signal Processing with MATLAB 5", publisher = "Prentice Hall", year = "1998", OPTedition = "", OPTannote = " note new edition" } @Book{pratt-va, author = "W.K. Pratt", title = "Developing Visual Applications, XIL: An Imaging Foundation Library", publisher = "Sun Microsytems/Prentice hall", year = "1997", OPTedition = "", OPTannote = "" } @Book{pedrycz-gomide, author = "W. Pedrycz and F. Gomide", title = "An Introduction to Fuzzy Sets", publisher = "MIT Press", year = "1998", OPTedition = "", OPTannote = "" } @Book{starck-murtagh=bijaoui, author = "J-L. Starck and F. Murtagh and A. Bijouai", title = "Image Processing and Data Analysis", publisher = "Cambridge University Press", year = "1998" OPTannote = " 2 copies" } @Book{brigham-fft, author = "E.O. Brigham", title = "The Fast Fourier Transform and its Applications", publisher = "Prentice Hall", year = "1988", OPTedition = "", OPTannote = "" } @Book{brown-hwang-3e, author = "R.G. Brown and P.Y.C. Hwang", title = "Introduction to Random Signals and Applied Kalman Filtering: with MATLAB Exercises and Solutions 3rd ed.", publisher = "John Wiley", year = "1998" } @Book{lindley, author = "C. Lindley", title = "Photographic Imaging Techniques in C++", publisher = "John Wiley", year = "1995" } ** new for 1997 ** Bruce Batchelor and Paul Whelan Intelligent Vision Systems for Industry Springer ISBN 3-540-10060-1 1997 2 copies -- both OWL ** J.R. Parker Algorithms for Image Processing and Computer Vision John Wiley ISBN 0-471-14056-2 1997 2 copies -- both OWL ** R. Jain and R. Kasturi and B. Schunck Machine Vision McGraw-Hill ISBN 0-07-032018-7 1995 1 copy -- OWL ** Ulf Grenander Elements of Pattern Theory The Johns Hopkins University Press ISBN 0-8018-5188-2 1996 1 copy -- OWL ** M. Nelson The Data Compression Book ***2nd ed*** MIS Press 1996 1 OWL K.R. Castleman Digital Image Processing Prentice Hall 1996 3 copies 2 of which OWL C.M. Bishop Neural Networks for Pattern Recognition Oxford University Press 1995 2 copies M. Sonka, V. Hlavac, and R. Boyle Image Processing, Analysis and Machine Vision Chapman and Hall, 1993. 2 copies B.D. Ripley Pattern Recognition and Neural Networks Cambridge University Press 1996 2 copies H. Bassmann and P.W. Besslich Ad Oculos -- Digital Image Processing International Thompson 1995 2 copies I. Pitas Digital Image Processing Algorithms Prentice Hall 1993 1 OWL J.R. Parker Practical Computer Vision Using C Wiley 1994 1 OWL R.J. Schalkoff Digital Image Processing and Computer Vision Wiley 1989 2 OWL F. van der Heijden Image Based Measurement Systems Wiley 1994 1 OWL R.N. Bracewell Two-dimensional Imaging Prentice Hall 1995 1 OWL R.C. Gonzalez and R.E. Woods Digital Image Processing Addison-Wesley 1992. (effectively 3rd ed. of Gonzalez and Wintz). 1 OWL, 2 SL W. Niblack An Introduction to Digital Image Processing 2nd Ed. Prentice-Hall 1986 1 OWL, 1 SL A. Rosenfeld and A.C. Kak Digital Picture Processing - Vol. 1 Academic Press 1982 1 OWL, 1 SL A. Rosenfeld and A.C. Kak Digital Picture Processing - Vol. 2 Academic Press 1982 1 OWL, 1 SL W.K. Pratt Digital Image Processing 2nd ed. Wiley 1991 1 OWL, 1 SL R.D. Boyle and R.C. Thomas Computer Vision: A First Course Blackwell Scientific 1988 1 OWL, 1 SL A.K. Jain Fundamentals of Digital Image Processing Prentice-Hall Int. 1989. 1 OWL, 1 SL T.M. Lillesand and R.W. Kiefer Remote Sensing and Image Interpretation 2nd. ed. John Wiley 1987. 1 OWL, 1 SL Teuber, J. Digital Image Processing Prentice-Hall 1993 OWL R.M. Haralick and L.G. Shapiro Computer and Robot Vision Volume 1. Addison-Wesley 1992. OWL R.M. Haralick and L.G. Shapiro Computer and Robot Vision Volume 2. Addison-Wesley 1993. OWL J.S. Lim Two-Dimensional Signal and Image Processing Prentice-Hall 1990 OWL K.S. Fu, R.C. Gonzalez, and C.S.G. Lee Robotics - Control, Sensing, Vision, and Intelligence McGraw-Hill 1987 OWL R.O. Duda and P.E. Hart Pattern Classification and Scene Analysis Wiley 1973 1 SL, 1 OWL D.H. Ballard and C.M. Brown Computer Vision Prentice-Hall 1982 1 SL, 1 OWL D. Vernon Machine Vision Prentice-Hall 1991 1 SL, 1 OWL A. Low Introductory Computer Vision and Image Processing McGraw-Hill 1991 1 OWL, 1 SL T. Masters Practical Neural Network Recipes in C++ London: Academic Press. 1993. OWL T. Masters Signal and Image Processing with Neural Networks: A C++ Sourcebook John Wiley, 1994 OWL T. Masters Advanced Algorithms for Neural Networks: a C++ sourcebook John Wiley 1995 OWL T. Masters Neural, Novel and Hybrid Algorithms for Time Series Prediction John Wiley 1995 OWL S.M. Bozic Digital and Kalman Filtering, 2nd ed. Edward Arnold, 1994 OWL J. Hertz, A. Krogh and R.G. Palmer Introduction to the Theory of Neural Computation Addison-Wesley, 1991 OWL A.V. Oppenheim and R.W. Schafer Discrete Time Signal Processing Prentice-Hall 1989 1 SL, 1 OWL A.V. Oppenheim and R.W. Schafer Digital Signal Processing Prentice-Hall 1975 1 SL, 1 OWL S. Chang Principles of Pictorial Information Systems Prentice-Hall 1989 SL C.W. Therrien Decision Estimation and Classification John Wiley 1989 1 OWL, 1 SL R. Kasturi and R. Jain Computer Vision: Principles IEEE 1991 A collection of some key papers. OWL R. Kasturi and R. Jain Computer Vision: Advances and Applications IEEE 1991 Another collection of pepers. OWL R. Chellappa Digital Image Processing 2nd ed. IEEE Press 1991 OWL P.A. Lynn and W. Fuerst Introductory Digital Signal Processing with Computer Applications John Wiley and Sons 1989 1 SL, 1 OWL Schalkoff, R. Pattern Recognition: Statistical, Structural and Neural Approaches Wiley 1992 1 SL, 1 OWL Szymanski, J.E. Basic Mathematics for Electronic Engineers Van Nostrand Reinhold 1989 OWL Winston, P.H. Artificial Intelligence *3rd ed.* Addison-Wesley 1992 1 SL, 1 OWL Hecht-Nielsen, R. Neurocomputing Addison-Wesley 1992 OWL Fukunaga, K. Introduction to Statistical Pattern Recognition *2nd ed.* Academic Press 1992. OWL Wasserman, P.D. Neural Computing: Theory and Practice Van Nostrand 1989. 1 SL, 1 OWL Wasserman, P.D. Advanced Methods in Neural Computing Van Nostrand 1993 OWL M. Nelson The Data Compression Book MIS Press 1991 1 SL, 1 OWL A.D. Kulkarni Artificial Neural Networks for Image Understanding Van Nostrand Reinhold, 1994. OWL 1.5 Applications. ----------------- Throughout the course we will continuously refer to the applications of image processing technology. Some of the areas we shall touch are: (a) Machine Vision. ------------------- How to get a machine to sense a scene and perform the perception, recognition, and knowledge acquisition tasks that are routine for human observers. Broadly speaking there are two important sub-areas of machine vision: - 3-dimensional scene analysis, e.g. for automatic vehicle navigation. Difficult, except in very limited domains. Still a research area. - automatic / automated inspection, e.g. quality control of computer printed circuit boards, or of pastry cases, metal parts, etc... The Signal and Image Processing Group at Magee is doing research into flaw detection in textiles. Here the world is essentially two dimensional. Now state of the art technology. (b) Character recognition. -------------------------- The grand-daddy of all image processing / pattern recognition tasks. How to convert ink marks on a page into text characters. Similar technology can be used for recognising any planar shape, e.g. for a robot to pick from a selection of parts on a conveyer belt. Printed (block) character recognition is more or less state-of-the-art; cursive, and handwritten script much more difficult - still a research area. (c) Medical. ------------ - blood cell analysis; looking for abnormal shapes, or abnormal proportions of shapes. - computer-aided tomography. Construction of 3-d. 'image' from a set of 2-d X-ray images of cross-sections. - automatic screening of chest X-ray images. - teleradiology and telemedecine, i.e. enabling a specialist to medical images and other measurements of a patient while they (doctor and patient) are sepatated by large distances. (d) Remote Sensing ------------------ Images of the earth, sensed from satellites and aircraft, can be processed to: - assist in weather forecasting. Of course, fairly 'raw' unprocessed images are routinely used in weather processing, see the TV at 9.30pm any night. - automatically produce land-use maps, - mineral exploration, - evaluate the extent of global warming, - earth's radiation budget studies. - pollution monitoring. - some of the earliest digital image processing work was done at the Jet Propulsion Laboratory, California, to 'clean-up' images sent from deep space probes (eg. Venus). (e) Military. ------------- - automatic guidance of heat seeking missiles; images are infra-red, - interpreting remotely sensed images from spy satellites and aircraft, - determination if 'friend' aircraft, from 'foe' using, e.g., silhouette images, - etc... Previously military applications were the primary instigators of most electronics related research and development, and this was especially so in image science. Not so important now that the cold- war is over. (f) Document Image Processing ----------------------------- Increasingly, business records (letters, balance sheets, etc...) are prepared on computer, and stored on them, i.e. as strings of digits, alphabetic characters, even computer-aided-design drawing codes. But, certain types of document originate outside the computer, e.g. cheques, delivery documents with receipt signatures, or are difficult to convert into symbols, so that the best that can be done is to make a digital picture record of them, and store that on a computer. (g) Entertainment and Consumer Items ------------------------------------ - digital video, - still images on computers, - digital cameras, etc... (h) Geographical Information Systems ------------------------------------ The combination of many different sorts of spatial information in one data-base, e.g. Ordnance Survey map, satellite image (see above), census data, gas and electricity mains, geology maps etc... As in general artificial intelligence, we shall find the irony that many of the image processing tasks that humans regard as routine ('childs-play') are difficult for machines. Fortunately, the reverse is true: some processing tasks that appear impossible to human eyes and brains, that require enormous attention to detail, that are boring and repetitive, or have to be done in hostile environments, can be automated quite simply. 1.6 What is a digital image? ---------------------------- Image. ------ As used in most of these lectures, the term image or, strictly, monochrome image, refers to a two-dimensional brightness function f(x,y), where x and y denote spatial coordinates, and the value of f at any point (x,y) gives the brightness (or, grey level) at that point. Monochrome versus colour. ------------------------- Mostly we shall deal with monochrome ('black-and-white') images - i.e. f(,) is a grey level. In a colour image f(,) gives a colour. A colour image can be represented by three mono. images, each representing the intensity of a primary colour (eg. red, green, blue). Thus, fr(x,y), fg(x,y), fb(x,y). More usefully, a colour image is represented by f(b,x,y), where b denotes colour (b = band), where band = 0, 1, or 2, for red, green blue. Digital? -------- The monochrome image, f(x,y), mentioned above is continuous (or analogue in some parlance), in TWO senses: - f(.,.) is a real number, and, - x, and y are real numbers. Thus, you can achieve infinitessimally fine resolution in f(.,.), x, and y. As always in computers we must use 'digital' or 'discrete' approximations. We approximate f(.,.) by restricting it to a discrete set of grey levels (often, in image processing systems, an 8-bit integer {0..255}, AND, we sample f(.,.) at discrete points xi, i=0..n-1, and yj, j=0..m-1, see Figure 1.6-1. 0 ymax 0 m-1 +--------------------> y +---------------------> j (c) 0 | . . 0 | fd(1,1) fd(1,2) | x1,y1 x2,y1 | | | | . | fd(i,j) | xi,yj i | | (r)| | | xmax| n-1| fd(n-1,m-1) V V Continuous image: Discrete: domain of f(,) is domain of fd(,) is [0,xmax] x [0,ymax] {0..n-1} x {0..m-1} ie. an n-row x m-column image (n x m) [To be consistent with most textbooks and published papers, I have transposed the normal usage of x- and y-axes.] Also, to remove any ambiguity about which axis each index (i,j) refers to, we tend to use f(r,c), where r = row, c= column. Figure 1.6-1 Correspondence of continuous and discrete axes. ------------------------------------------------------------ Thus, we arrive at a digital image: fd(r,c) where fd can take on discrete values {0..G-1} and r, {0..n-1}, c, {0..m-1} which can be viewed as a matrix (or two-dimensional array) of numbers: |fd(0,0) fd(0,1) ... fd(0,m-1) | |fd(1,0) fd(1,1) ... fd(1,m-1) | | | fd(i,j)= | | Eq. 1.6-1 | | |fd(n-1,0) fd(n-1,1) ... fd(n-1,m-1)| (from now on we will drop the d, and use f(r,c) ) For the convenience of certain algorithms, or due to the hardware configuration, digital images are often square (m=n). As well, (again for convenience of representation in digital computers) n is usually a power of 2: 16, 32, 64, 128...1024, etc. Since the grey level value will usually be stored in an integer computer word, G, also, will be a power of two; though, there is no reason not to use floating point or some other numerical representation. Ex. 1.6-1. For a monochrome image, G = 255, n= 1024, and m= 1024, how many bytes will the image occupy? Ex. 1.6-2. Repeat Ex. 1.6-1 for a colour image. Ex. 1.6-3. If you had to digitise a TV image, suggest suitable values of m, n. Ex. 1.6-4. Using the results of Ex. 1.6-3, and assuming 25 frames (images) per second, how much data for a one hour film ? Other examples of digital quantities. ------------------------------------- Music on tape, or vinyl LP is continuous. Music on CD is digital. CD sampling rate is 44,100 samples per second, 12-bits per sample, 2 channels (stereo). In modern telephone systems, speech is transferred digitally between major exchanges - here you can get away with 8,000 samples per second, and 8-bits per sample. Raster sampling or scanning. ---------------------------- The image model given above corresponds to the image model used in raster graphics, i.e. the image is formed by regular sampling of the x-, and y-axes. Pixel. ------ Each f(r,c) in eq. 1.6-1 is a picture element or pixel, (the equivalent term 'pel' is used in some texts). Spatial Resolution. ------------------- Spatial resolution (or just resolution) is HIGH if the samples xi, yj are closely spaced, and is low if they are widely spaced. Clearly, the closer the spacing, the more alike the digital image will be to the original, i.e. we are always demanding higher resolution. On the other hand, the higher the resolution, the larger are m, n - more data; data volume grows as the square of the resolution. Ex. 1.6-5. Laser printers commonly work at 300 dots per inch. How many pixels in an A4 page. Ex. 1.6-6. How many bits per pixel (ie. what is G) for a laser printer image? Ex. 1.6-7. (a) See Exs 1.6-5, 1.6-6, how much data in a laser printer image of an A4 page? (b) Compare with the data volume in an A4 page stored as ASCII text. Ex. 1.6-8. The ground resolution of weather satellites is about 1 to 5 Kilometers. The ground resolution for some spy satellites is reported to be 15 centimeters. (a) A satellite image has a ground resolution of 15 cm. What does a ground resolution of 15 cm mean? (b) Using such an image, could a skilled interpreter (i) read the number plate of a car? (ii) determine the type of a car? (iii) count the number of cars in a car park? (iv) evaluate how many people at a football match? (v) tell where the football was at the time the image was taken? Ex. 1.6-9. Assuming monochrome, ground resolution of 15 cm. and 8 bits per pixel, how much data in an image of the Derry area? Ex. 1.6-10. You are designing an image processing system to do quality control of lace fabric. The structure of the lace is such that thread separation is 0.5 mm. Suggest a suitable sampling resolution (in millimeters). The fabric is produced in 1.8 metres wide rolls, and it passes the inspection point at 1 metre per second). How many pixels per second? What does this suggest about the type of processor required? Ex. 1.6-11. Take some measurements and suggest a resolution for flaw detection in denim (jean) fabric. Grey-Level Resolution. ---------------------- With proper selection of the digitisation range, it is usually possible to represent, without any humanly perceivable degradation, monochrome images using just 8-bits; the psychologists tell us that humans can perceive no more than 160 levels at once. Also, in my experience, there appears to be some natural law that says that most image sensors cannot deliver any useful higher amplitude resolution. 1.7 Signal Processing. --------------------- Signal processing - and digital signal processing - are closely related to image processing; wheras images are two-dimensional, signals are one-dimensional. A music signal coming from a microphone, or going to a loudspeaker, is a continuous voltage signal - a continuous function of time: f(t). A DIGITAL SIGNAL is a sequence of numbers: f0, f1, ... fn-1 where f0 may represent the (digitised) voltage at t=t0 f1 ... at t1. i.e. the function is sampled (and digitised) at t0, t1, t2, ... Ex. 1.7-1. Assuming 12-bits per sample, 44.1K samples per second (44.1 KiloHertz, KHz), 2 channels, how much data on a 1 hour CD? Compare this to the result obtained in Ex. 1.6-4 (one hour of video). Ex. 1.7-2. What is the data rate available on a normal 'Kermit' line connected to the VAX? Would it be possible to send digitised speech or CD music over it? Ex. 1.7-3. What is the data rate available on a normal BT modem line? Would it be possible to send speech or CD music over it? If you cannot send digital speech over that line, can you see any paradox? We shall cover some aspects of signal processing because some processing techniques are more easily understood in one-dimension; and they are easily transferred to two-d. 1.8 Colour and Spectral Bands, Sensing -------------------------------------- As mentioned earlier, a colour image can be represented by three monochrome images or bands. But, image sensors are not just limited to visible light; some remotely sensed images are made up of 10 or more spectral bands. Sensing. -------- To produce f(x,y) (forget about digitisation for a moment) from a scene, sensing must take place. That is, the brightness of each point (x,y), in the field-of-view (FOV) of the observer, must be measured. In a practical system this can be considered to be accomplished by passing a small aperture (opening) over the field of view, stopping when the aperture centre is over the discrete sample point (xi,yj) and taking the average brightness within the aperture. Usually, the width and height of the aperture are about the same as the horizontal and vertical sampling periods, respectively. 1.9 General concepts of image processing. ----------------------------------------- The general concept of a process involving image processing may be seen in Figure 1.9-1. The flow of information is from right (the original scene) which reflects light into the the sensor; the sensor converts light into a voltage; the voltage (a continuous quantity) is sampled and digitised to yield a number; some (numerical/digital) image processing is done; the numbers must be converted back into a voltage and transferred to a display which procuces patters of light that the user can see. Quantities used ----------------------------------------------------- Light Voltage Numbers Numbers/volts Light Scene----->Sens------>Digit------->Image------->Disp----->User's -or -ise Processing -lay Eyes or Printer--Hard Copy Picture or Store on file or Transmit Figure 1.9-1 Overview of Imaging and Image Processing ----------------------------------------------------- Typical image processing operations are: - smooth out the graininess, speckle or noise in an image, - remove the blur from an image, - improve the contrast, - segment the image into regions, e.g. on a printed circuit board, plastic region, copper region. - remove warps or distortions, - code the image in some efficient way for storage or transmission. Some example images and processing follow. The images shown are 26 rows x 52 columns. They are coded using a grey level character coding scheme - 10 levels as follows: white space . , - / = + X B M low value high value Note that this is the 'negative' of what would appear on a normal display, where low=black, high=white. View these images from about 1 metre - or defocus your eyes if closer. Figure 1.9-2(a) shows a simple digital image - a cross, high values an a low value background. The values are all {0..255}. Figure 1.9-2(b) shows a printout of the values in the central rows (11 to 13). 0123456789012345678901234567890123456789012345678901 ---------------------------------------------------- 0| | 1| | 2| | 3| | 4| MMMMM | 5| MMMMM | 6| MMMMM | 7| MMMMM | 8| MMMMM | 9| MMMMM | 0| MMMMM | 1| MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM | 2| MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM | 3| MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM | 4| MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM | 5| MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM | 6| MMMMM | 7| MMMMM | 8| MMMMM | 9| MMMMM | 0| MMMMM | 1| MMMMM | 2| | 3| | 4| | 5| | +----------------------------------------------------+ +0123456789012345678901234567890123456789012345678901 DataLab I-00:00 17/09/94 18:42 Figure 1.9-2(a) Digital Image of a Cross ---------------------------------------- [11,0] (row 11, column 0) 0.00 0.00 0.00 0.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.00 0.00 0.00 0.00 [12, 0] 0.00 0.00 0.00 0.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.00 0.00 0.00 0.00 [13, 0] 0.00 0.00 0.00 0.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.00 0.00 0.00 0.00 Figure 1.9-2(b) Images values for rows 11-13 of Fig. 1.9-2(a) ------------------------------------------------------------- Figure 1.9-3 shows a similar image, but with noise added; e.g. the original image received on TV in an area of poor reception. 0123456789012345678901234567890123456789012345678901 ---------------------------------------------------- 0|-,-,-.., . ./,.. -., ,.- , ,,./, -- /.-,/,.,.-./,| 1|,//-,- ,.. -,-.-,,,- // /.,,/./- ,/--,,,/,,/,,/.,,| 2|/. ./. ., ./..--- / ,-/ /- .-,...-....-.,, -- .,| 3|/- /..., --/,/ - ,, ,,,-, , /--.// .-/,-.,. ,-/,-/.,| 4|,,,.. - ,, - -,. / , ,B++== . - --, ,-/-- , .,-| 5|, ,-.,,, ,/ /,. .. . .=XXX/,- .. //.,. ../-,,..,./| 6|,-.- ,. //./,..-/./ /- .B+BBB/-- ,-. - ,.- ,. -- ,- | 7|-.-/ ./., ,.,/./,-, , . X+XBX, / ..-. ,,-.-,,-.,,| 8|-,.,/,..--,/.- -. , ,.. =B=XB..,./-.-, .,,.,..,.., /| 9| --.,/-. ,, /...//.-,.,B+XB= , -,---,/--../, ./ | 0| , /-,-,, ./.././. ,-,.+XX/+ --, . . -/.-,,, ..--,.| 1|-.-,BBX+B/=B=/=X=/B+B+B+B/+XX+XX==+X=XBB==X=BB/= - ,| 2|/ ..B+X+XBXXB=B++=XB/B/XBBX=X=+=B=B+X=X=XX+///+=//,-| 3|,- /BBB=B+/==X=+BBX=X=X=+/X=++=X/B=+===B=X=XB=/=--, | 4|-.. =B+B+BBXXX+BB++B=BXMXX/==+=+XBXBB+X/XBXB/+B/,,-/| 5| .-.==XBXXX=X=+++BBBBX+=BB++B+B+=X///BX+XBBXXXX=.,. | 6| . .///- ./ /,-- /-/.==+B/, - ,-,, / ,. / / -/ -| 7| //,.--- ,,-,-. ./,,./X/X++..-,,---. /- ./-,/,,/.| 8| . .//.,-.,. - /-- .-XXX+/,-,.,,.. .,. --.,--,-,.| 9|,.,-, - --- --../ , . ==X/=/,/ -.. -./., -.,., ./,| 0|./ , /-/.//-. ,/- ,-.- XXB=B-. ., ,-/-,.,,, -- ./| 1|-,,, .,/,..- ,. . ,, ,,.XB+BB / .,-. ,, ,--.,-, ., | 2| --.- - ,- -/- -.,/ - / ,/ /. /./,/,.,.,-- ...-| 3|,.,,. - .- . ,-.../, ,..-- -,--,// ,-.- -/ --, , | 4|,. /. ,--,-,// /-- -. .-,---// ,..,.-.,, ---. --| 5| ,. -.-/-- --, ,--...-/,,-/-.-.,-.,.,-- ,/,., ./,/.| +----------------------------------------------------+ +0123456789012345678901234567890123456789012345678901 DataLab I-01:00 17/09/94 18:42 [11, 0] 0.74 0.41 0.72 0.60 1.78 1.78 1.64 1.48 1.92 1.04 1.27 1.79 1.18 1.01 1.19 1.57 1.25 1.09 1.90 1.40 1.78 1.55 1.86 1.49 1.80 1.03 1.36 1.63 1.60 1.41 1.55 1.65 1.19 1.32 1.46 1.75 1.22 1.64 1.87 1.97 1.32 1.31 1.67 1.24 1.91 1.78 1.07 1.14 0.07 0.74 0.02 0.62 [12, 0] 0.98 0.11 0.33 0.35 1.96 1.45 1.58 1.45 1.68 1.80 1.70 1.66 1.84 1.17 1.80 1.53 1.50 1.21 1.57 1.77 1.06 1.86 1.08 1.68 1.87 1.78 1.66 1.30 1.59 1.23 1.54 1.30 1.84 1.16 1.83 1.46 1.60 1.27 1.63 1.21 1.68 1.68 1.51 1.01 1.03 1.01 1.42 1.22 0.98 0.97 0.56 0.79 [13, 0] 0.58 0.69 0.18 0.95 1.89 1.82 1.86 1.30 1.80 1.48 1.00 1.32 1.23 1.77 1.28 1.33 1.85 1.95 1.72 1.31 1.63 1.19 1.59 1.26 1.49 1.04 1.69 1.23 1.50 1.34 1.20 1.75 1.09 1.91 1.22 1.43 1.22 1.19 1.12 1.91 1.28 1.74 1.27 1.73 1.92 1.27 1.07 1.32 0.85 0.75 0.62 0.16 Figure 1.9-3 Noisy version of Figure 1.9-2. ------------------------------------------ Figure 1.9-4 shows an attempt to smooth out the noise - by passing the following 'averaging' window over the image: 1/9 1/9 1/9 1/9 1/9 1/9 1/9 1/9 1/9 By 'passing a window' over the image, we simply mean that each pixel in the output image is the average of the original pixel together with its eight neighbours. The smoothing is somewhat successful, but note the blurring of some edges and corners. 0123456789012345678901234567890123456789012345678901 ---------------------------------------------------- 0|,,,,,. ..,..,,,...,,,.. ..,,,..,,..,,,,,,,,....| 1|,,,,,. ..,..,,,...,,,.. ..,,,..,,..,,,,,,,,....| 2|,,,,,. ....,.,.,,,,..,,,,.,.,,,,,..,,,.....,,,,,...| 3|,,.... ....,.,..... ..-----,..... ..,,...,,,,,.....| 4|..,.......,,,., . .,/===--......,,,...,,-,,...,,| 5|,,,.. ....,.,.,..... ,=XXX=-.. ........,,,.....,,| 6|,,,.. ....,,,,,,,,.. . ,=BBB+/,... .... ..,,,.......| 7|,,,......,,,,.,,,,.... ,=XBBX/..... ...............| 8|..,,,,,...,.,.,.,,.... -=XBB+- ......,..,,,.........| 9|...,,-,.........,,.....-=XBX=- ......,,,,,,.........| 0|...,-=//--,,-,---/////-=+XXX=/--,,,--/=///----,.,...| 1|.. -=++++=============++XBXX++=+==/=/=+++===///--,,,| 2|.../+MBBXXXXX+XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX++//,,,| 3|...-+BBBBBBXXXXBBBBXXXBBBX+++++XXBBBXX+XXBBX+++=/-,,| 4|...-=XBBBBXXXXXXBBBBBBXXXXXXXXXXXXX+XXXXBBBBBXX=-,..| 5| .-=+XX++=====+++++X++XXXX+=====+==+===++++==/,...| 6|....,-/=/--,--/-///////=+XXX+/-----------/--///-,...| 7| ...,-,,. ..,...,,,,,/=XXX=,....,,.. .....,,-,,...| 8|..,,,,,,......... .,...-=XXX=-,.,.,.. ......,,,,,,,,| 9|....,,,,,,,. ........ -=XXX=/,.... .,,...,,.,.....| 0|..,...,,,,,,. ....... -=XXX=/.. .. .,,,....... ..| 1|,,,...,,,.,.......... .,/++=/-.. ....,,,...,,,. ..| 2|,,,. ...... ......... ..--//-,.,.,,,,,,,...,,,.. . | 3|..... .,....,.,.,.,. ..,...,,,,,,,,,,,...,,,.....| 4|..... .,,,,......,........,,..,,,,,,..,,,.,..,,,....| 5|..... .,,,,......,........,,..,,,,,,..,,,.,..,,,....| +----------------------------------------------------+ +0123456789012345678901234567890123456789012345678901 DataLab I-03:00 17/09/94 18:44 [11, 0] 0.45 0.45 0.36 0.81 1.09 1.40 1.28 1.28 1.25 1.17 1.13 1.18 1.13 1.09 1.09 1.15 1.14 1.12 1.16 1.19 1.22 1.22 1.24 1.33 1.45 1.58 1.47 1.49 1.27 1.25 1.16 1.25 1.11 1.08 1.04 1.11 1.05 1.17 1.27 1.30 1.23 1.18 1.12 1.10 1.00 1.02 0.94 0.82 0.79 0.60 0.60 0.60 [12, 0] 0.53 0.53 0.48 0.97 1.40 1.75 1.60 1.64 1.55 1.52 1.45 1.44 1.44 1.39 1.41 1.48 1.47 1.56 1.55 1.57 1.51 1.51 1.51 1.57 1.49 1.52 1.41 1.51 1.42 1.44 1.44 1.46 1.47 1.45 1.50 1.46 1.42 1.42 1.54 1.55 1.57 1.50 1.46 1.48 1.43 1.39 1.26 1.02 0.89 0.62 0.58 0.58 [13, 0] 0.46 0.46 0.37 0.81 1.28 1.66 1.63 1.60 1.63 1.63 1.61 1.56 1.55 1.53 1.53 1.62 1.61 1.61 1.58 1.52 1.55 1.48 1.59 1.59 1.60 1.54 1.40 1.39 1.35 1.37 1.38 1.44 1.57 1.60 1.62 1.58 1.48 1.43 1.38 1.47 1.56 1.59 1.60 1.47 1.39 1.35 1.31 1.16 0.92 0.73 0.66 0.66 Figure 1.9-5 Smoothed image --------------------------- Ex. 1.9-1. Verify that the smoothed image (Fig. 1.9-5) is what I say it is. i.e. apply the window to the data in Figure 1.9-4 and check with 1.9-5. Ans: if you look at row 12, column 3 of Figure 1.9-5, you will find that the pixel value is 0.48. Now look at Figure 1.9-3, you will find that the pixels in the 3 x 3 'neighbourhood' of (12,3) are 0.41 0.72 0.60 0.11 0.33 0.35 0.69 0.18 0.95 Add these, and you will get 4.34. Divide by 9 to get 0.48. Figure 1.9-6 shows the result of an 'edge-detector' applied to Figure 1.9-2; i.e. the 'edge image' is low valued in areas where the original is constant valued, and has high values where ther are abrupt changes (edges). The edge detector used was Sobel gradient (see Gonzalez and Woods). 0123456789012345678901234567890123456789012345678901 ---------------------------------------------------- 0| | 1| | 2| | 3| -+++++- | 4| +M+++M+ | 5| ++ ++ | 6| ++ ++ | 7| ++ ++ | 8| ++ ++ | 9| ++ ++ | 0| -+++++++++++++++++++M+ +M++++++++++++++++++- | 1| +M+++++++++++++++++++- -++++++++++++++++++M+ | 2| ++ ++ | 3| ++ ++ | 4| ++ ++ | 5| +M+++++++++++++++++++- -++++++++++++++++++M+ | 6| -+++++++++++++++++++M+ +M++++++++++++++++++- | 7| ++ ++ | 8| ++ ++ | 9| ++ ++ | 0| ++ ++ | 1| +M+++M+ | 2| -+++++- | 3| | 4| | 5| | +----------------------------------------------------+ +0123456789012345678901234567890123456789012345678901 DataLab I-02:00 17/09/94 18:43 [11, 0] 0.00 0.00 0.00 1.00 1.50 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.50 0.00 0.00 0.00 0.50 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.50 1.00 0.00 0.00 0.00 [12, 0] 0.00 0.00 0.00 1.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 1.00 0.00 0.00 0.00 [13, 0] 0.00 0.00 0.00 1.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 1.00 0.00 0.00 0.00 Figure 1.9-6 'Cross' Image after edge detection. ------------------------------------------------ Ex. 1.9-2. If you had an image of a photographic negative, suggest a digital method of producing a 'positive' of this. Ex. 1.9-3. (a) If you had an image of a face - in colour, i.e. three monochrome images fr, fg, fb for red green and blue. You find that the flesh tones correspond to: fr= 100+/-10, fg= 70+/-6, fb= 50+/-10 Sketch an algorithm that will convert the flesh colour to pure bright white (255,255,255), while leaving other tones untouched. Ex. 1.9-3 (b) You have a satellite image in blue, green, red, and infra-red (4 mono. images this time, fr, fg, fb, and fir). You find that water (landuse type 1) is usually (m11+/-s11, m12+/-s12, m13+/-s13, m14+/-s14) farmland (type2) (m21+/-s11, m22+/-s12, m23+/-s13, m24+/-s14) Sketch an algorithm that will produce a label image (ie. landuse map), fl, containing 1 where there is water, 2 where there is farmland, and 0 otherwise. Ex. 1.9-4 Suggest a window (or algorithm) for detecting bright spots one pixel wide. Ex. 1.9-5 You have an image stored in a 2-dim. array in C (convert to Modula-2 if you wish): #define NROW 64 #define NCOL 64 unsigned char f1[NROW][NCOL],f2[NROW][NCOL]; int rl,rh,cl,ch,r,l; float val; /* read in image...*/ rl=0;rh=NROW-1; cl=0;ch=NCOL-1; for(r=rl;r<=rh;r++){ for(c=cl;c<=ch;c++){ val=(float)f1[r][c]; val=val+2; f2[r][c]=(unsigned char)val; } } The above adds 2 to each pixel of image f1 and puts the result into image f2. Show how you would: (a) Add two images f1 and f2 to produce a third f3. What precautions should you take. Hint, what is range of unsigned char. (b) Negate f1 and put the result in f2. (c) Apply a 3 x 3 smoothing window, see above. (d) Implement the algorithm that is the answer to Ex. 1.9-3., assuming three input images fr1, fg1, fb1 and three output images fr2, fg2, fb2. (e) Implement the algorithm in 1.9-3 (b) assuming your parameters, w and s, are stored in two-dimensional arrays m[NTYPE][NBAND] and s[NTYPE][NBAND], and your multispectral image is stored in f[NROW][NCOL][NBAND]. 1.10 Image Processing Practical & Demonstration Software. -------------------------------------------------------- 1.10.1 General. ------------- For practicals we will mostly use the DataLab image processing package - see below. DataLab is on the network server: u:\jc\dl. An associated directory contains some image data: u:\jc\imdat. 1.8.2 DataLab. ------------- Here is part of the DataLab users manual. Full copies of it and the programmers manual are available on the VAX directory VAX: [cfhv.ip]dlusr.a, dlprg.a). DataLab is a portable software laboratory for research and development of algorithms in signal processing, image processing, pattern recognition, estimation, and general mutivariate data processing. DataLab can be used as a stand-alone 'end-user' tool / package. However, its primary aim and use is as a software environment for the rapid implementation and evaluation of algorithms. It has grown from being a personal research tool to one which serves as (i) the development environment of a signal and image processing systems research group at University of Ulster, (ii) as a development tool for final year and doctorate projects, and (iii) as a teaching aid for a course in image processing. The specific objectives were threefold: first, to provide a seamless environment for developing and experimenting with algorithms in the fields mentioned, second, to provide an good developers's environment - i.e. to facilitate the implementation of new algorithms with NO systems programming, third, for use as a teaching tool. DataLab provides an environment for processing and handling signal (1-D), image (2-D), and multivariate data sets. As well as a comprehensive set of some 200 commands / DataLab functions, there is a rich toolkit of both scientific and general houskeeping routines. It is programmed entirely in ANSI C. It requires no special hardware, i.e. it runs on a bare PC with SVGA graphics, it does not require 386/486, it does not require extended memory - rather, unfortunately, it cannot use memory above DOS 640 K barrier. In DataLab all data are stored as DataLab images; DataLab is modelled as an image calculator with number of image 'memories' / stores. These image objects are held in main memory. Thus, for example, a user may create an image, put in image '0', read another from a file, put in image '1', then add images '0', and '1' and put the result in image '2'. In computing, the term 'image' usually means a digital image, i.e. a set of colour / band 'planes' each containing a two-dimensional array of numbers. A DataLab image can, by specialisation, be used to store a wide range of data structures. The most general structure is a multidimensional image which could represent, amongst other things, a multicolour two-dimensional spatial intensity pattern, i.e. whose pixels are vector valued, or a time sequence of monochrome images. Next, an image with only one row, is a digital-'signal', i.e. a discrete data sequence. This, too could be vector valued. For 'sets' of data, i.e. data collections in which there is no relationship between individual members, we can again use one row of an image. A single scalar can be stored as an image with one plane, one row, and one column! At present, a complex image must be represented as a pair of reals. DataLab supports matrix and vector mathematics; matrices are single dimensional images; vectors are single dimensional, single row, images. When image objects are being defined, DataLab allows users to specify representation data type: BYTE [0..255], INT [-32768 .. 32767], LINT [32 bit signed integer], REAL [32 bit IEEE floating- point]. However, in most situations, elementary data are treated as 'numbers' - with integers, and floating-point types treated uniformly. In addition to the primary image data, a DataLab image may possess two additional ancillary blocks of data: label data - which are typically used for class labels in pattern classification experiments, and ancillary data, typically the dependent / concomitant data in estimation experiments. Clearly, some DataLab functions are restricted in the type / structure of images they can handle; for example, the operation of an two-dimensional edge detector is not defined for an image with only one row, the inverse Discrete Fourier transform requires a pair of images, etc. Nevertheless, many functions eg. 'add' operates uniformly on any structure of data. In general, the principle adopted is 'if the user specified it', try to do it! With integer data types, especially BYTE, users must be very vigilant of overflow and underflow. The following list gives an indication of the range of functions available in DataLab. - Arithmetic: typically, add two images to produce a third. - Display: all graphic and text displays, including to printer and file. - Miscellaneous Data Handling: includes creation and deletion of images, copying. - Data Generation: generation of test data and patterns, random and deterministic. - File Handling: file input-output, both formatted(ASCII) and unformatted. - One-dimensional Digital Filters: simple recursive digital filters. - Correlation and Convolution: 1- and 2-dimensional correlation using 'fast' FFT methods. - Image Enhancement: smoothing, edge detection, median filtering etc. - Fourier and other Transforms: one-dimensional DFT operations (using FFT), two-dimensional DFT, Wavelet transform, Walsh transform, Hough transform. - Pattern Classification: numeric pattern classification, especially statistical pattern recognition; includes multilayer perceptron (backpropogation trained) neural network, and fuzzy rule-based classifer. - Estimation: estimation, eg. multivariate linear regression; includes multilayer perceptron (backpropogation trained) neural network estimation, and fuzzy rule- based estimation. - Feature Extraction and Discriminant Analysis: linear transformations for feature extraction and discrimination. - Two-dimensional Texture Analysis: eg. various measures obtained from co- occurrence matrix. - Two-dimensional Shape Recognition: two-dimensional moments. - Image Morphology: both binary and grey-level. - Matrix and Vector Arithmetic: provides a basic calculator for matrix and vector arithmetic. STOP PRESS: DataLab now runs under Linux, under the X-Window System -- I will give you details of where to access the software at the first lecture. Demo Files ---------- If you type 'demoX' x=1,2,3... in response to the DL prompt :> , DL will run through some of the examples that we have talked about. 1.9 Sample Examination Paper. ---------------------------- Examination Autumn Term 1994 ---------------------------- Answer any FOUR questions. All questions carry equal marks. The Appendix is intentional. 1. Machine Inspection Applications of Image Processing. You are developing a system for automatic inspection for denim fabric manufacture. Under the following headings, describe the major issues that will occupy you - giving where possible definite problems, solutions, and other suggestions. (a) image acquisition, (b) preprocessing, e.g. calibration, noise reduction, (c) data volumes and data rates, (d) algorithms for fault detection, (e) economic rationale for such a system. [25 marks] 2. Linear Shift-invariant Filters, and Non-linear Filters. (a) Compare and contrast linear shift-invariant filters, with non-linear filters; [Suggestion: you could indicate what is the essential matter that distinguishes linear from non-linear, what implications this has on implementation, and give specific examples of instances and applications of each] [12 marks] (b) The image given in Figure 2-1 is corrupted with 'salt' (the 99s) and 'pepper' (the 01s and 02s) noise. Discuss at least TWO possible filtering methods to reduce this noise, pointing out their strengths and weaknesses specific to this application. [We do NOT require you to filter the whole image, just pick key areas / pixels to exemplify your answer]. [8 marks] 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 01 21 21 21 21 21 21 21 21 02 21 21 21 21 21 21 21 21 21 21 99 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 42 42 42 42 42 42 42 42 21 21 21 21 21 21 21 21 42 42 42 42 42 42 42 42 21 21 21 21 21 21 21 21 42 42 42 42 42 42 42 42 21 21 21 21 21 21 21 21 42 42 42 99 42 42 42 42 21 21 21 21 21 21 21 21 42 42 42 42 42 42 42 42 21 21 21 21 21 21 21 21 42 42 42 42 42 42 42 42 21 21 21 21 21 21 21 21 42 42 42 02 42 42 99 42 21 21 21 21 21 21 21 21 42 42 42 42 42 42 42 42 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 99 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 21 Figure 2-1 2(c) What is the general principle behind the use of summation / averaging in noise reduction? [5 marks] 3. Edge Enhancement and Detection. (a) Explain the use of gradient operators (filters) in image sharpening and edge enhancement; explain further steps that must be performed to fully DETECT edges; be careful to explain the role of gradient. [12 marks] (b) Use a Sobel gradient filter to enhance the edges in the image in Figure 3-1. Comment on your result. [8 marks] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 1 1 1 1 1 0 0 0 0 1 1 1 1 1 0 0 0 0 1 1 1 1 1 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Figure 3-1 (c) Describe one practical application of edge detection. [5 marks]. 4. Discrete Convolution. The following equations define two-dimensional discrete convolution: N-1 M-1 (4-1) g[r,c] = sum sum f[r-k,c-l].h[k,l] k=0 l=0 r c (4-2) g[r,c] = sum sum f[k,l].h[r-k,c-l] k=r-N+1 l=c-M+1 In each case, h[.,.] is an N x M two-dimensional image, row: [0..N-1], column: [0..M-1]. For part (b), assume that f[.,.] and g[.,.] are NR x NC. (a) Pick one of them (*), explain the components, and explain, physically / practically, what it is doing. You should specifically explain which part of the equation is often called the impulse response, any why. [12 marks] (*) If you prefer, use the following ONE-dimensional equivalents, and answer the same question - the marks are the same. M-1 (4-3) y[n] = sum x[n-m].h[m] m=0 m=n (4-4) y[n] = sum x[m].h[n-m] m=n-M+1 In each case, h[.] is an M sample discrete sequence. For part (b), assume that y[.] and x[.] are N samples long. (b) Referring to one of the equations above, write a computer program, to implement either one- or two-dimensional convolution. [Use any language you wish, or pseudo-code if you prefer; do NOT concern yourself with edge effects]. [6 marks] 4(c) Convolve f[.,.] with h[.,.] and comment on the result. [7 marks] h[.,.] = 0 1 2 1 1 2 1 2 3 f[.,.] = 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5. Discrete Fourier Transform. (a) Explain how the Discrete Fourier Transform can be used to do spectrum analysis. [8 marks] (b) Compute the frequency spectrum of a digital signal containing an 'impulse' x[n] = {1,0,0,0,0.....}; [7 marks] (c) How can a version of the Discrete Fourier Transform be used to perform 'fast' convolutions or correlations; give numerical examples of the performance improvements that can be obtained. [8 marks] 6. Pattern Classification. Figure 6-1 summarises a general pattern classification system. +-----------------+ +---------------+ | | | | Observ- | Feature |Feature | Classifier |Deci --------->| Extraction |-------->| |---> -ation x | |Vector y | |sion Vector | | | | w +-----------------+ +---------------+ Figure 6-1 Pattern Classification System ---------------------------------------- (a) Explain the components of Figure 6-1. [10 marks] (b) Explain the roles of EITHER distance, OR, discriminants, in pattern classification algorithms. [5 marks] (c) Table 6-1 below shows two-dimensional feature data for a two- class problem. (i) Explain the use of a nearest mean classifier, and one other relatively different classification method, to solve the problem. (ii) Verify that either of the dimensions, alone, is inadequate to separate the data. [10 marks] [Hint: a scatter diagram, and / or a histogram will help you a lot]. Class x1 x2 ---------------- 1 0.4 1.0 1 1.0 0.4 1 1.0 1.0 1 1.0 1.6 1 1.6 1.0 2 1.4 2.0 2 2.0 1.4 2 2.0 2.0 2 2.0 2.6 2 2.6 2.0 Table 6-1 Two-class, Two-dimensional feature data. -------------------------------------------------- 7. Neural Networks and Pattern Recognition. (a) What is meant by the statement, "neural networks are trained, not programmed", and explain one major advantage, and one major disadvantage of this fact. [10 marks] (b) What is the significance of the XOR function in the history and theory of neural networks. [6 marks] (c) Figure 7-1 below shows four two-pixel images and their associated classes (class 0 or class 1); '*' denotes bright, value 1, blank denotes dark, value 0; describe a neural network that will distinguish class 1 objects from class 0. [9 marks] x1 x2 +-----+-----+ |*****|*****| class 1 |*****|*****| +-----+-----+ x1 x2 +-----+-----+ |*****| | class 0 |*****| | +-----+-----+ x1 x2 +-----+-----+ | |*****| class 0 | |*****| +-----+-----+ x1 x2 +-----+-----+ | | | class 0 | | | +-----+-----+ Figure 7-1 ---------- 8. Data Compression. (a) Distinguish between LOSSY and LOSSLESS data compression; give instances of each, and give examples of applications that CAN, and CANNOT tolerate lossy compression. [6 marks] (b) Describe a numerical measure of information (see Appendix); in your answer, as well as giving a precise definition of the measure, you should give intuitive explanation of how it can be applied to any sort of 'information', not just numerical data. [8 marks] (c) A two-bit image has relative frequency of occurrence (probabilities) of grey levels as follows: grey-level(i) 0 1 2 3 frequency (pi) 0.5 0.25 0.125 0.125 (i) show that the average entropy per pixel is 1.75. [3 marks] (ii) Suggest a coding scheme that will allow the image to be compressed from two bits per pixel to an average of 1.75 bits per pixel. [We do NOT require detailed working of any algorithm or formula]. [2 marks] (iii) Prove, either mathematically, or intuitively, that your scheme in (ii) will actually achieve 1.75 bits per symbol.[3 marks] (iv) Identify ONE difficulty with single symbol entropy based methods, and suggest one way the difficulty can be reduced. [3 marks] Appendix A: Formulae and Algorithms for Image Processing -------------------------------------------------------- Sobel Gradient Operators: hv = -1 0 +1 hh = -1 -2 -1 -2 0 +2 0 0 0 -1 0 +1 +1 +2 +1 Discrete Convolution: --------------------- +inf y[n] = x[n] (*) h[n] = # x[n-m].h[m] m=-inf For finite sequences: N-1 y[n] = x[n] (*) h[n] = # x[n-m].h[m] m=0 For images: N-1 M-1 g[r,c] = sum sum f[k,l].h[r-k,c-l] k=0 l=0 Integral Fourier transform: --------------------------- +inf X(w) = (1/2.PI) ! x(t).exp(-jwt) dw -inf Inverse transform: ----------------- +inf x(t) = (1/2.PI) ! X(w)exp(+jwt) dw -inf Discrete Fourier Transform: --------------------------- N-1 X[u] = (1/N)sum x[n] exp(-j.2PI.u.n/N) n=0 Inverse DFT: ------------ N-1 x[n] = sum X[u] exp(+j.2PI.u.n/N) n=0 Linear Contrast Stretch. ----------------------- z' = T(z) = (zG-z0).(z-a)/(b-a) + z0 where z = input value z'= output value z0..zG = desired output range a..b = input range. Histogram / Probability Distribution Transformation. ---------------------------------------------------- Notation: Input grey levels z, Output grey levels w, Transformation (lut) w = T(z), Input probability distribution p(z), cdf P(z) Desired (output) distribution q(w), cdf Q(w) Np = number of pixels in image. Algorithm: (1) Estimate frequency of grey levels (histogram), ph(z). (2) From ph(z) compute p(z) = ph(z)/Np. (3) Form cdfs P(z), and Q(w), from p() and q(), respectively, z' eg. P(z') = sum p(z) z=z0 (4) for each z in the input do: choose w such that Q(w) is closest to P(z), set T(z) = w; (5) Apply T(z) to input image f1[r,c]: f2[r,c] = T(f1[r,c]) Entropy. -------- Average entropy, Hav = - sum { p(zi).log2[p(zi)] } Logarithm to base B in terms of log10. ------------------------------------- logB(x) = log10(x)/log10(B). log10(2) = 0.3010 ie. log2(x) = log10(x)/0.3010 = 3.322 . log10(x) Huffman coding procedure. ------------------------- 1. List the symbols, along with their probability. 2. Pick the two least probable nodes. 3. Make a new node out of these two - adding the probabilities. 4. Repeat steps 2, 3, until only one node remains - the root. 5. Bit allocation. Starting at the root, allocate 1 to one branch (left, say), and 0 to the other; until all branches have been allocated. 6. Read out the codes. Starting at the root, travel to the leaf that represents each level, reading off the bits that make up the code for that level. Sigmoid activation function. ---------------------------- sigmoid(x) = 1/(1+exp(-x)) Procedure k-Means Clustering: Inputs: nc = number of classes, xij data to be clustered; i=1..d; j=1..N d = dimensionality of data vectors N = total number of pixels. Outputs: labj, j=1..N, labels of pixels xj labj = {1..nc} 1. Partition the pixels arbitrarily into nc classes (labels), eg. divide the image into nc regions, label all the pixels in the first region '1', second region, label '2', etc. 2. Compute the mean vectors of each class: m[i][c], i=1..d, c=1..nc 3. Procedure: k-Means Classifier: 3.0 Change = False 3.1 for j = 1..N do 3.2 reset Distmin = Bignumber, class = 0 3.3 Find nearest mean: 3.3.1 for c = 1..nc do 3.3.2 Compute Dist = Distance (xj to mean mc) 3.3.3 if Dist < Distmin then Distmin = Dist class = c 3.3.4 end. 3.4 labj = class; if 'changed' set Change = True. 3.5 end. 4. if Change = True goto 2 (*loop again*) 5. end. (* we are finished*) Function Distance(x,m) (*this calculates squared Euclidean distance - because there is no need to take square root, since we are not interested in actual values, just in finding the minimum*) Dist = 0 for i = 1 .. d do Dist = Dist + (xi -mi)**2 (*Euclidean distance*) return Dist end. end of examination paper. end of file.