University of Ulster, Magee College. Faculty of Informatics. BSc. Applied Computing (1220), Year 4. Course: Image Processing (AC460). Autumn Term 1996. Lecturer: J.G. Campbell. Date: 04/08/96 University of Ulster, Magee College. BSc. Applied Computing (1220), Year 4. Course: Image Processing (AC460). Autumn Term 1995. Lecturer: J.G. Campbell. Date: / /95 Examination Autumn 1995 ----------------------- ANSWERS and MARKING ------------------- General note to external examiner. This year you may notice three slight variations from previous years: (a) In many of the opening recall parts, I have tried to make the questions more open, e.g." ... explain the significance of ...". This will give the students more freedom to discuss widely, and perhaps include matter found outside lectures and notes (and my prejudiced head!!). (b) I have tried to give choice within questions. On the other hand, I have slightly reduced choice across questions. (c) Related to (b), in some places, I have tried to allow students to express their understanding of a topic in the form of implementation as a computer program. This addresses two issues: (i) many of these students are poor at mathematics and algebraic reasoning, and the operational description is the best they can do; (ii) we have tried to introduce more computer programming into the practical part of the course, and the coursework. If this paper acceptable to you, students will be warned of these slight shifts of style. 1. Digital Image Fundamentals and Image Sensing. [Answer (a), (b), and EITHER (c) OR (d)] (a) Explain in detail, employing diagrammatic illustrations where appropriate, the process of the creation of a two-dimensional 'digital image' from a three-dimensional scene? [10 marks] ANS. Scene: ----- is the world scene in front of the camera. [1 mark] Effectively, the three-d. scene (or the light emanating from it) must be projected onto a two-dimensional image (plane). [2 marks] Image ----- The term image or, strictly, monochrome image, refers to a two-dimensional brightness function f(x,y), where x and y denote spatial coordinates, and the value of f at any point (x,y) gives the brightness (or, grey level) at that point. [2 marks] Digital? -- two senses [3 marks] -------- The mono. image, f(x,y), mentioned above is continuous (or analogue in some parlance), in the senses that f is a real number, and that x, and y are real numbers, ie. you can achieve infinitessimally fine resolution in f, x, and y. As always in computers we must use 'digital' or 'discrete' approximations. We approximate f by restricting it to a discrete set of grey levels (often, in image processing systems, an 8-bit integer {0..255}, and we sample f at discrete points xi, i=0..n-1, and yj, j=0..m-1, see Figure 1.6-1. To be consistent with most textbooks and published papers, I have transposed the normal usage of x- and y-axes. 0 ymax 0 m-1 +--------------------> y +---------------------> j 0 | . . 0 | fd(1,1) fd(1,2) | x1,y1 x2,y1 | | | | . | fd(i,j) | xi,yj i | | | | | xmax| n-1| fd(n-1,m-1) V V Continuous image: Discrete: domain of f(,) is domain of fd(,) is [0,xmax] x [0,ymax] {0..n-1} x {0..m-1} ie. an n-row x n-column image (n x m) Figure 1.6-1 Correspondence of continuous and discrete axes. ------------------------------------------------------------ To remove any ambiguity about which axis each index (i,j) refers to, it can be beneficial to use f(r,c), where r = row, c= column. Thus, we arrive at a digital image: fd(r,c) where fd can take on discrete values {0..G-1} and r = {0..n-1}, c = {0..m-1} which can be viewed as a matrix (or two-dimensional array) of numbers: [fd(0,0) fd(0,1) ... fd(0,m-1)] [fd(1,0) fd(1,1) ... fd(1,m-1)] [ ] fd(i,j)= [ ] Eq. 1.6-1 [ ] [fd(n-1,0) fd(n-1,1) ... fd(n-1,m-1)] (from now on we will drop the d, and use f(i,j) ) Raster sampling or scanning. [2 marks] ---------------------------- The image model given above corresponds to the image model used in raster graphics, ie. the image is formed by regular sampling of the x-, and y-axes. [1 mark] Pixel. ------ Each f(r,c) in eq. 1.6-1 is a picture element or pixel, (the equivalent term 'pel' is used in some texts). [1 mark] Spatial Resolution. ------------------- Spatial resolution (or just resolution) is high if the samples xi, yj are closely spaced, is low if widely spaced. Clearly, the higher the resolution, the larger are m, n for the same area (of the continuous image). The ground resolution of weather satellites is about 1 to 5 Kilometers. The ground resolution for some spy satellites is reported to be 15 centimeters. [2 marks] Grey-Level Resolution. ---------------------- With proper selection of the digitisation range, it is usually possible to represent (without any humanly perceivable degradation) mono. images with 8-bits (the psychologists tell us that humans can perceive no more than 160 levels at once). Also, in my experience, there appears to be some natural law that says that most image sensors cannot deliver any useful higher amplitude resolution. [2 marks] [Max 10 marks] (b) Equation 1-1 describes the sensing of a two-dimensional (planar) scene: f(x,y) = b(x,y) + h(x,y).r(x,y).i(x,y) + n(x,y) (1-1) where f(x,y) is the value recorded for the pixel corresponding to position (x,y) in the plane, r(x,y) is the reflectivity at (x,y), i(x,y) is the illumination incident on the plane at (x,y), h(x,y) is the gain of the sensor system for pixel (x,y), b(x,y) is the bias of the sensor system for pixel (x,y), n(x,y) is the additive noise at pixel (x,y). Discuss equation 1-1, paying attention to how the user of the image is likely to interpret the image data, and how the various unwanted elements (data) and their interactions may be eliminated. [9 marks] ANS: [Note to external examiner. The students will have encountered this equation in class, and will have been invited to do a worked example based on it; nevertheless, given the way it is phrased, I would certainly not consider it a 'straight-recall' question, and those who get anywhere near full marks will have shown particular insight and understanding]. - user most likely wants r(x,y), since this is characteristic of the scene -- he probably doesn't care much about the illumination, or sensor, would really like to ignore them, [2 marks] - noise reduced by grabbing multiple images (over time) and averaging; reduce by sqrt(n), [2 marks] - bias and gain can appear as part of image, i.e. wanted image has them superimposed on it, e.g. constant scene not constant! [2 marks] - i(,) and h(,) interact, and so can be calibrated out together, typically by white or constant grey card, [3 marks] - bias can be calibrated by shutting off all light entering sensor. [2 marks] MAX 9 marks (c) a visible colour image may be represented by three 'bands', nevertheless, in a digital image processing system, it is possible to envisage images of very many more than three bands. Discuss. [6 marks] ANS: ---- [lose half marks (3) if neither 1. nor 2. mentioned at all] 1. 'band can represent anything, e.g. population density, altitude, [2 marks] 2. three primary colours OK for visible, 3 filters, etc. [2 marks] 3. electromagnetic spectrum / wavelength: infinity of bands, microwave etc. [2 marks] 4. or equivalent gravity [2 marks] Max. 6 marks OR (d) discuss image representation in computer memory. [6 marks] ANS. - plain two-d. array 2 marks, - dynamic (run-time configurable size etc.) 1 mark - discussion of types: [1 mark for trivial], 1 more for elaboration of compromise, - sparse images, non-raster [2 marks] - equivalent gravity [2 marks] MAX. 6 marks. 2. Image Enhancement. (a) In the context of image enhancement describe the principles underlying: point operations, and neighbourhood operations. Your answer should should include representative examples of each, together with illustrations of their operation, and descriptions of applications for which one type is more suitable than the other. [10 marks] ANS: - point: only individual input pixel value considered in computing output, - neighbourhood: output a function of a number of pixels, 3 marks - examples identified 3 marks - illustrations of operation 4 marks - applications 4 marks MAX 10 marks (b) Equation 2-1 defines a statistics based contrast stretching transformation, z' = (z-m) . s'/s + m' 2-1 where z', and z are, respectively, the output and input values, s', s are the output and input standard deviations, m', m are the output and means. Discuss this transformation both in the context of general image enhancement and of part (a); e.g. compare it to other transformations, and to other methods used to obtain the transformation parameters, discuss in what circumstances might it be used, give illustrative examples, etc. [9 marks] ANS: - non-subjective method of equalising the average grey level and contrast of a number of images (normalisation); the method given in this section and the next 3 marks - (histogram modification) meet this aim. - comparison with other methods 1 mark each e.g. Suppose we have an input image whose mean grey level is m and standard deviation is s. We desire m' and s'. This is simply shifting and scaling again (see chap 4.5) and the transformation given in eqn. 4.7-1 produces an output image with the desire mean and std. dev. (4.7-1) z' = (z-m) . s'/s + m' ^ ^ ^ | | | shift scale shift to mean = m' to mean = 0 Compare eqn 4.5-2 (4.5-2) z'= (zG-z0).(z-a)/(b-a) + z0 - it is a point operation 1 mark - circumstances used: all data in small range, see (c), needs to be stretched 2 marks - discussion of meaning of standard. dev. and mean 2 marks MAX. 9 marks (c) Apply equation 2-1 to the image shown in Figure 2-1, using the values m' = 128, s' = 40. Comment on the significance of the values m' = 128, and s' = 40. [6 marks] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 5 5 5 5 5 Figure 2-1 5 x 5 image ----------------------- ANS: sum(f[r,c]) sum({f[r,c]-m)}^2) 1 1 1 1 1 5 20 [ (3 - 1)^2 = 4 x 5 ] 2 2 2 2 2 10 5 3 3 3 3 3 15 0 4 4 4 4 4 20 5 5 5 5 5 5 25 20 -- -- m = 75/25 = 3; v = 50/25 = 2 s = sqrt(v) = 1.414 z' = (z-m) . s'/s + m' 2-1 (z-3) x 40/1.414 + 128 say 40/1.414 = 30 z'(1) = -2 x 30 + 128 = 68 2 = -1 x 30 + 128 = 98 3 = 0 x 30 + 128 = 128 4 = +1 x 30 + 128 = 158 5 = 188 4 marks or pro rata Significance: 3 x sigma spread, formula should transform data to mean 128, spread +/- 120, ie. filling range of 0..255 display. 2 marks 3. Convolution, Edge Detection, and related. [Answer parts (a), EITHER (b) OR (c), and (d)] (a) Explain the importance of 'convolution', in EITHER digital signal processing (one-dimension), OR digital image processing (two-dimensions). [9 marks] ANS: - very open; expect 3 x points properly identified, explained, preferably with examples; e.g. - convolution = general operator, some weights => smoothing, some sharpening, i.e. same operator, different parameters; averaging, differentiation. - models ANY linear time (shift) invariant operation, - operation exactly characterised by impulse response (== weights), - explanation of operation may not be rewarded, (b) Explain, using appropriate examples, application of the Prewitt operators shown in Figure 3-1; in addition, explain the significance of each operator, and how their results are normally combined. [9 marks] hr = -1 0 +1 hc = -1 -1 -1 -1 0 +1 0 0 0 -1 0 +1 +1 +1 +1 (a) vertical edges. (b) horizontal edges. Figure 3-1 Prewitt Operators ANS: - convolution, - sliding window / mask across image, - weighted sum at each point, - discrete differentiation (differencing), - differentiation = gradient = steepness = edginess, - one enhances horizontal edges, the other vertical; need example, - combination to get overall edge, sum of absolutes, or sqrt(sum of squares); possible justification, - appropriate example. Marks: operation 3 marks, theory 3 marks, examples and/or insight 3 marks. OR (c) Write a computer program (or computer programs) to apply the Prewitt operators shown in Figure 3-1; explain how their results are normally combined. Use any language of your choice, or pseudo- code. [9 marks] ANS: - basics of program (see below) 7 marks or pro-rata for fully parameterised / general like below; 5 for special purpose -- which, in any case, is likely to be a mess, and more difficult than the general! - combination 2 marks. static void conv2(IMD *g,IMD *f,IMD *h,Lint d, Lint rl,Lint rh,Lint cl,Lint ch, Lint wl,Lint wh,Lint vl,Lint vh) { float val,valf,valh,sum; Lint r,c,k,l,kk,ll; char s[80]; for(r=rl;r<=rh;r++){ for(c=cl;c<=ch;c++){ sum=0.0; /* below equivalent to eqn. 3.8-6, chapter 3 wh c=vh g[r,c] = # # f[r-k,c-l],h[k,l] k=wl l=vl */ for(k=wl;k<=wh;k++){ for(l=vl;l<=vh;l++){ IMDget(&valf,d,r-k,c-l,f); IMDget(&valh,d,k,l,h); sum+=valf*valh; } } IMDput(&sum,d,r,c,g); } } return; } (d) Figure 3-2 shows a character representation of an image of a cross; 'M' represents pixel value 1, while blank represents 0. Describe the results that would be obtained using Prewitt edge enhancement. Identify and discuss examples of common problems associated with edge enhancement, and suggest pre-, and / or post- processing that may eliminate them. [7 marks] ANS: Result something like follows, with noise points (individual ' 's and 'M's) enhanced as well as edges, 4 marks Problems: noise enhanced; eliminate -- median filtering 1.5 marks width of edge -- thinning 1.5 break in edge (top of rh. arm) -- edge tracking 1.5 MAX. 3. 0123456789012345678901234567890123456789012345678901 ---------------------------------------------------- 0| | 1| | 2| M M | 3| | 4| MMMMM | 5| MMMMM | 6| M MMMMM | 7| MMMMM | 8| MMMMM | 9| MMMMM | 0| MMMMM | 1| MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM MMMMMMMMMM | 2| MMMMMMM MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM | 3| MMMMMMMMMMMMM MMMMMMMMMMMMMM MMMMMMMMMMMMMMM | 4| MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM | 5| MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM | 6| MMMMM | 7| MMMMM | 8| MMMMM | 9| MMMMM | 0| MMMMM | 1| MMMMM M | 2| | 3| | 4| | 5| | +----------------------------------------------------+ +0123456789012345678901234567890123456789012345678901 4. Discrete Fourier Transform. [Refer to formulae in the Appendix, as neccessary.] (a) Give an interpretation of the discrete Fourier transform (DFT), and explain its significance in EITHER digital signal processing (one-dimension), OR digital image processing (two-dimensions). [10 marks] ANS: - discrete version of integral Fourier transform, - explain exp(jwt) -> cos(wt) + j.sin(wt), - DFT ANALYSES signal / image into cos / sin functions, - inverse SYNTHESISES, - transform from time / space to frequency, - spectrum analysis, - use in implementing filters, - use for implementing convolution. 4 x significant points (2.5 marks each) OR (b) Write a computer program to implement, with appropriate comments, the one-dimensional DFT; Use any language of your choice, or appropriate pseudo-code. [10 marks] ANS: program showing understanding of principles, 7 marks or pro-rata, - insightful comments 3 marks. - let us assume that we are transforming forward, and that the signal is real. Hence, the software, function rfdft (Real, Forward DFT): void rfdft( float x[], /*- real input -*/ float xro[], float xio[],/*- real and imag. outputs -*/ int N) /*- length -*/ { float pi = 3.1415926,tpi; int n, u; tpi=2.0*pi; for(u=0;u 4-4 (two impulses), 4-2 impulse -> 4-3 white spectrum no marks for guess, 4 marks for correct answer + intuitive explanation, 4 more marks for numerical working out of key elements, 1 * x[n] | * * | | +---1---*---3---4---5---*---7----------------------> n | | | * * -1 | * (b) ---------------------------------------- n 0 1 2 3 4 5 6 7 ---------------------------------------- x[n] 1 0.707 0 -.707 -1 -.707 0 0.707 ---------------------------------------- (a) Figure 4-1 1 * x[n] | | | +---*---*---*---*---*---*---*-----> n 0 1 2 3 4 5 6 7 (a) -------------------------------------- n 0 1 2 3 4 5 6 7 -------------------------------------- x[n] 1 0 0 0 0 0 0 0 -------------------------------------- (b) Figure 4-2 --------------------------------------------- u 0 1 2 3 4 5 6 7 --------------------------------------------- X[u] .125 .125 .125 .125 .125 .125 .125 .125 --------------------------------------------- Figure 4-3 --------------------------------------------- u 0 1 2 3 4 5 6 7 --------------------------------------------- X[u] 0 0.5 0 0 0 0 0 0.5 --------------------------------------------- Figure 4-4 (d) Hence, or otherwise, deduce, with brief justification, the DFT of the sequence given by Figure 4-5. [2 marks] ANS: 4-4 = 4-1 + 4-2 ; DFT linear, therefore simply add DFTs --------------------------------------------- u 0 1 2 3 4 5 6 7 --------------------------------------------- X[u] .125 .625 .125 .125 .125 .125 .125 .625 --------------------------------------------- 2 * | | 1 | x[n] | * * | | +---1---*---3---4---5---*---7----------------------> n | | | * * -1 | * ---------------------------------------- n 0 1 2 3 4 5 6 7 ---------------------------------------- x[n] 2 0.707 0 -.707 -1 -.707 0 0.707 ---------------------------------------- Figure 4-5 (e) Hence, or otherwise, deduce, with brief justification, the DFT of the convolution of sequence in Figure 4-1, with that in 4-2. Comment on the significance of this result. [5 marks] ANS: Convolution in time == multiply in frequency, therefore [2.5 marks] Discussion convolution with impulse == no change 2.5 marks Slight complication of factor of N (8) 1 mark --------------------------------------------- u 0 1 2 3 4 5 6 7 --------------------------------------------- X[u] 0 .0625 0 0 0 0 0 .0625 --------------------------------------------- 5. Data Compression (a) Discuss entropy as a measure of information. [8 marks] ANS: 4 x points clearly identified, explained / justified, preferably with examples. (b) Verify, for the image shown in Figure 5-1 (histogram in Figure 5-2), that the average entropy per symbol is 2.55. [4 marks] ANS: see pp. 83,85 of Rosie, A.M. (1973), Information and Communication Theory, van Nostrand Reinhold, with A-> 0, B-> 1, C-> 2, D-> 3, E-> 4, F-> 5, G-> 6, H-> 7. pi 0.1 0.18 0.4 0.05 0.06 0.1 0.07 0.04 entropy H = -#pi*log(pi) = 2.55 bits per symbol. 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 7 7 7 7 Figure 5-1 Image 10 x 10 ---------------------------------------------------- z 0 1 2 3 4 5 6 7 ---------------------------------------------------- p(z) 0.1 0.18 0.4 0.05 0.06 0.1 0.07 0.04 ----------------------------------------------------- Figure 5-2 Probabilities for Figure 5-1 (c) Figure 5-3 shows Huffman code lengths for the same data; calculate the average length per symbol for your Huffman code, and explain any difference between this and the average entropy. [4 marks] ---------------------------------------------------- z 0 1 2 3 4 5 6 7 ---------------------------------------------------- p(z) 0.1 0.18 0.4 0.05 0.06 0.1 0.07 0.04 ----------------------------------------------------- L(z) 3 3 1 5 4 4 4 5 ----------------------------------------------------- Figure 5-3 Probabilities and Huffman Code Lengths The lengths of codewords are A-> 0, B-> 1, C-> 2, D-> 3, E-> 4, F-> 5, G-> 6, H-> 7. pi 0.1 0.18 0.4 0.05 0.06 0.1 0.07 0.04 Li 3 3 1 5 4 4 4 5 Lave = #pi.Li = 2.61 [2 marks] Difference -- p(z) not powers of two [2 marks] (d) Explain the two chief reasons why the Lempel-Ziv compression scheme can normally outperform Huffman compression and is nowadays, therefore, the algorithm of choice for general applications (disc- file and data communications). Illustrate the principle or operation of the Lempel-Ziv algorithm on the binary sequence below; assume that 0, 1 are already stored in the codebook. [9 marks] [Essentially, in L-Z, encoding is performed by parsing the source data stream into substrings that are the shortest substrings which have not already encountered.] Ans: - approx 3 x 2 = 6 marks for 'two chief reasons' -- maybe more if real insight shown; possible compensation with next part. Some Problems with Single Symbol Source Coding. ----------------------------------------------- 1. Symbols are NOT independent, redundancy exists in strings of symbols. Single symbol coding cannot exploit this potentially rich source of compression. Ideally, we would like to compute the probabilities for all pairs of characters {x1, x2}, hence requiring 65536 (why this number?) probabilities to be estimated; in this case, we would expect {'t','h'} to have a high probability value, and, as mentioned earlier, {'q','u'} to have nearly as high a probability as a single 'q'. But, we wouldn't stop at pairs, we would go on to triples {x, x2, x3} -- in which case {'i','n','g'}, {'t','h','e'} would be high in the league table; of course, the table of probabilities has now 256 x 256 x 256 entries = 16.9 x 10^6 which is getting a bit large! Ex. If, say, 'th' has a probability of 0.03125 (== 1/32 == 2^-5), we can code it with a 5-bit code. On the other hand, if we take single letters, 't' has probability 0.104, and 'h' 0.053; if we round these up to powers of two, we get p('t') = 0.125 == 1/8 == 2^-3, and p('h') = 0.0625 == 2^-4, i.e. for an optimum code, we get a total of 7 bits for the two of them. On the other hand, using p('th') == 0.03125, we arrive at an optimum code of 5 bits for the pair. 2. If the probabilities used to make the code are wrong, then compression performance can drop badly. And, since it is difficult to make such coding schemes adaptive, i.e. make them adapt to changing probabilities, then the coding can stray from optimum for large sections of the massage. For example, in this part of the notes, there is a lot of English, so the probabilities for normal English text may work O.K. However, if we switch to a C program, the probabilities will be wrong. 3. The average symbol length (Lav) of single symbol coding will achieve average entropy only if the probabilities are integer powers of 2; i.e. if we have a symbol i whose probability, pi, is 1/4, we can code it with -log2(1/4) = 2 bits; however, if we have probability, pi = 1/5, -log2(1/5) = 2.3, so we need to round up and use 3 bits. We waste 0.7 bits every time that symbol occurs. - principle: The principle of Lempel-Ziv coding scheme can be seen from the following example. Essentially, in LZ, encoding is performed by parsing the source data stream into the shortest substrings that have not already encountered. The encoder and decoder keep in step and maintain codebooks that are identical. Example. (T.M. Cover and J.A. Thomas, Elements of Information Theory, Wiley, 1991, p. 319). Input data stream: 1011010100010... Assume that the symbols 0 and 1 are already in the codebook. Hence, we can proceed as follows: Sequences stored: 0, 1 Stream to be parsed: 11010100010... Shortest substring not yet encountered: -- Sequences stored: 0, 1, 11 Stream to be parsed: 010100010... Shortest substring not yet encountered: -- Sequences stored: 0, 1, 11, 01 Stream to be parsed: 0100010... Shortest substring not yet encountered: --- Sequences stored: 0, 1, 11, 01, 010 Stream to be parsed: 0010... Shortest substring not yet encountered: --- etc. At the end of the data stream shown, we will end up with a situation as follows, where 'data sent' is simply the shortest- substring-not-yet-encountered coded as: previous-substring, followed by the 'new' symbol Numerical position / codebook index: 1 2 3 4 5 6 7 Substring/codebook entry: 0 1 11 01 010 00 10 Data Sent: 21 12 41 11 21 Binary data sent: 0100 0011 1000 0010 0100 ^ | ---------------------------------+ Here, we see that 1000 has been sent; the 100 represents '4' which tells the decoder where to look in its (identical) codebook (where it will find '01'. - frequently occurring very long strings can be 'learned' by the system, with consequent large savings. - Lempel-Ziv can be done 'on-the-fly', and, hence, is suitable for use in modems; again, the encoder and decoder must keep in step. Note: there are many versions of Lempel-Ziv. Marks: 3 for showing understanding of 'sliding window', sending of 'pointers' / code-book indices, and encoder / decoder keep codebook in step. Possible compensation with previous part. 6. Neural Networks and Pattern Recognition. (a) Give a brief description of the structure, principles, and operation of a multilayer perceptron neural network (i.e. the common feedforward neural network). [10 marks] ANS: - structure: usual diagram showing many inputs x1...xp, outputs y1..ym, weights marked, nodes/neurons, + description / identification [4 marks] - principles: of neuron: weighted summation, threshold, weights etc.; mention of bias etc. Mention of brain analogy, etc. Memory in weights etc. [4 marks] - operation: training, coding of data etc. [4 marks] Repeated waffle: restirct to 6 marks, crisp insight, examples, criticism extra 4. Max. 10 (b) Plot a feature diagram of the two-dimensional feature data shown in Table 6-1. [Hint: as a check, note the class means supplied, and note that the data are distributed symmetrically]. [2 marks] Table 6-1 class x1 x2 ---------------------- 0 0.00 0.00 0 2.00 0.00 0 0.00 2.00 0 0.00 1.00 0 1.00 0.00 0 1.00 1.00 0 1.00 2.00 0 2.00 1.00 0 1.99 1.99 1 2.00 2.00 1 2.00 3.00 1 3.00 1.00 1 3.00 2.00 1 3.00 3.00 1 3.00 4.00 1 4.00 2.00 1 4.00 3.00 Class means Class 1: 0.998889 0.998889 Class 2: 3.000000 2.500000 ANS: | + | 6 * | 5 + * | 4 + * x | 3 + *x x x | 2 0 0 0*x x x | 1 0 0 0 *x | 0--0--0--*--+--+--+--+--> x1 1 2 3 4 - *s show boundary -- answer to parts (c) and (d) (c) Prove, or otherwise verify, that the neuron shown in Figure 6-1 will form a linear classification boundary which cuts the x2 axis at -w0/w2 and the x1 axis at -w0/w1. [5 marks] x1 | +1 \ | \w1 |w0 \ | \ v +--+--+ output x2 w2 | | y ------------+ +----------> | | +--+--+ Figure 6-1 ANS: Recall eqns. 9.3-3 and 9.3-4: sum = sum wi xi (9.3-3) i=0,n where w0 = - T and the summation is now 0,1,2 (n=2); and, eqn. 9.3-4, the neuron fires if sum > 0; thus, eqn. 9.3-3 can be written out in full as: sum = w0.(+1) + w1.x1 + w2.x2 (9.3-5) ( = f(x,w) ) > 0 for fire i.e. 1 output <= 0 for 0 output. Thus, the (sharp) boundary between the ft(x,w) = 1, and ft(x,w) = 0 (ft() is thresholded) is given by: f(x,w) = 0 (9.3-6) i.e. w0 + w1.x1 + w2.x2 = 0 (9.3-7) i.e. w1.x1 + w2.x2 = - w0 (9.3-8) This is a straight line which cuts the x1 axis at -(w0/w1) and cuts the x2 axis at -(w0/w2), see Figure 9.3-7. \ | \ -w0/w2 | \ | \ | \ | \ x2 |class 0 \ class 1 | \ | \ | \ | \ -w0/w1 0 +-------------------\--------> 0 x1 \ Figure 9.3-7 Perceptron Linear Decision Boundary. ----------------------------------------------- Thus, the perceptron can only discriminate between patterns which can be separated by a (single) straight line. Likewise functions, see section 9.3.7: OR and AND can be computed, BUT XOR cannot; see Exercise 9 of chapter 8.10. This was one of the achilles heels that Minsky and Papert successfully attacked. (d) Hence, or otherwise, verify that the weights below provide an appropriate classifying neuron for the data given in Table 6-1; pay particular attention to the points (1.99, 1.99) and (2.0, 2.0). w0 = -29.28 w1 = 9.76 w2 = 4.91 [4 marks] ANS: - either appeal to diagram / boundary (see (b) above) or fill in values into x1.w1 + x2.w2 + w0 > 0 class 1 else class 0 e.g. point (x1 = 4, x2 = 4) 4x9.76 + 4x 4.91 = 39 + 20 - 29.28 > 0 ---> class 1 point (x1 = 1, x2 = 1) 1x9.76 + 1x 4.91 = 9.76 + 4.91 - 29.28 < 0 ---> class 0 Boundary points: point (x1 = 2, x2 = 2) 2x9.76 + 2x 4.91 = 19.52 + 9.82 - 29.28 = 29.34 - 29.28 > 0 ---> class 1 point (x1 = 1.99, x2 = 1.99) 1.99x9.76 + 1.99x 4.91 = 19.42 + 9.77 - 29.28 = 29.19 - 29.28 < 0 ---> class 0 4 marks or pro-rata Answer either (e) OR (f) (e) Give a brief discussion on the ability of neural networks to generalise. [4 marks] ANS: - must be able to interpolate / extrapolate, ie. give answers for data not exactly like trained on, - otherwise just lookup table (f) Give a brief description of neural network training via backpropogation. [4 marks] ANS: - have set of correct inputs, outputs (xi, yi) i = 1 .. no. of training data - feed through xs, compute sum of errors sum(yi-y'i), - work out adjustment to output layer weights to reduce error, - backpropagate ... 7. The following 10 x 10 image Figure 1-1, is a small part of an infra-red satellite image of water (top left) and land (bottom left). [Note, you may have worked with similar data before; in this examination, to ease calculation, the original SPOT satellite data have been divided by 10]. (a) Segment the image using: (i) a thresholding based on the histogram, i.e. first obtain h(z), for z = 0, 1 .. 14; [6 marks] ANS: z = 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 h(z) 8 24 2 0 3 1 1 4 2 1 4 4 2 5 3 0 h(z) 10 20 30 +----+----+----+----+----+----+ z=0 |-------- |------------------------ |-- t=3 + |--- 5 |- |- |---- |-- |- 10 |---- |---- |-- |----- 14 |--- - table 3 marks - figure 2 marks - threshold set at t=3 (or therabouts) 1 mark; relevant, insightful, comment 1 or 2 marks. (ii) k-means clustering (k=2); use any shortcut you wish -- e.g. choose initial values for your means, e.g. ma = 1, mb = 10. [6 marks] (iii) Look at your histogram for part (a) again, and comment on the assumption of two classes (k=2). [2 marks] ANS: - there are minor peaks at z=4, 7, 10-11; maybe there are three other classes there - e.g. mud / wet sand, various land-covers / vegetations. (b) Figures 1-2, and 1-3 show further colours / bands. Assuming your segmentation from (a) is correct, what, approximately, are the three-dimensional mean vectors for water, mw, and land ml, respectively; comment on the discriminating power, for water versus land, of the three colours. [4 marks] - means 2 marks - clearly infra-red is the best, the CONTRAST between water and land is greatest in that band; water is a very poor reflector of infra-red, while land with healthy growing vegetation is a very good reflector. 3 marks max. 4 marks 0 0 0 0 0 1 1 1 0 0 1 1 1 1 1 1 0 1 1 1 1 1 1 2 1 1 1 1 1 2 6 10 1 1 1 4 4 7 12 13 1 4 8 7 7 11 13 11 5 11 8 7 10 13 14 12 10 10 9 11 13 14 14 13 Figure 7-1 Infra-red Image. 1 2 3 3 2 2 2 2 2 3 2 2 2 3 2 1 3 2 3 3 3 2 3 4 3 3 4 4 3 3 5 7 4 6 5 3 3 5 7 7 7 5 3 4 6 7 7 7 6 3 4 6 7 9 9 9 4 3 5 7 8 10 10 10 Figure 7-2 Red Image 3 3 3 4 3 4 3 3 3 3 3 3 3 3 3 2 3 3 4 5 3 3 3 3 4 4 4 4 3 2 5 6 6 4 3 3 3 3 5 6 5 3 3 4 4 5 6 6 3 2 3 4 6 6 6 6 3 2 3 6 6 6 7 6 Figure 7-3 Green Image (c) Discuss the application of k-means clustering (or segmentation in general) to data compression. [7 marks] ANS: Principle & operation: [5 marks] - k-means clustering is essentially same as vector quantisation, - vector quantisation in n-dimensions is analogous to uneven / non- linear quantisation in one dimension, - encoder (compressor) identifies k-means, - these sent as 'codebook', - then data are classified according to proximity to k-means, - and one of k-labels sent, - the decoder (decompressor) receives the label and uses the codebook to look up the mean -- which becomes the decompressed datum. Comments / Criticism: [3 marks] - this is a 'lossy' compression technique, - can be proved that quantisation in n-dims is less lossy than n quantisations in individual dimensions, - depends on what data will be used for, e.g. if it is to be used for land use classification, then maybe the classes are all that are needed. end. Miscellaneous: (b) A discrete memoryless source has an alphabet of seven symbols whose probabilities of occurrence are give in Table 5-1. Assuming Huffman coding, or any other code of your choice that offers high efficiency, compute (i) the code lengths for each symbol, (ii) the average code length, and (iii) explain why, for the data in question, it is possible to design a code whose efficiency is 100%. Symbol: s0 s1 s2 s3 s4 s5 s6 s7 [7 marks] C. Discuss vector quantisation. [Hint: segmentation]. [6 marks] ANS: e.g. colour image (3 bands), more efficient to quantise in 3- dimensional space than 3 times in 1-d. process: segment / classify - make codebook (lookup-table code-to- mean of class) - send codebook (once only) - then send codes Class Test No. 1 20 % of C/W Marks. 12/12/1994 10.10am 11.15am ----------------------------------- A N S W E R S and M A R K I N G ---------------------------------- ********************************** 1. [20 marks] The following 10 x 10 image Figure 1-1, is a small part of an infra-red satellite image of water (top left) and land (bottom left). [Note, you will have worked with such data before, but, to ease calculation, the data have been divided by 10]. (i) Segment the image using: (a) thresholding based on histogram, i.e. first obtain h(z), for z = 0, 1 .. 14; [6 marks] ANS: z = 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 h(z) 8 24 2 0 3 1 1 4 2 1 4 4 2 5 3 0 h(z) 10 20 30 +----+----+----+----+----+----+ z=0 |-------- |------------------------ |-- t=3 + |--- 5 |- |- |---- |-- |- 10 |---- |---- |-- |----- 14 |--- - table 3 marks - figure 2 marks - threshold set at t=3 (or therabouts) 1 mark; relevant, insightful, comment 1 or 2 marks. (b) k-means clustering (k=2); use any shortcut you wish -- e.g. choose initial values for your means, e.g. ma = 1, mb = 10. [6 marks] (c) Figures 1-2, and 1-3 show further colours / bands; (c-1) assuming your segmentation from (b) is correct, what, roughly, are the three-dimensional mean vectors for water, mw, and land ml, respectively. [3 marks]. (c-2) Comment on the discriminating power, for water versus land, of the three colours. [2 marks] - clearly infra-red is the best, the CONTRAST between water and land is greatest in that band; water is a very poor reflector of infra-red, while land with healthy growing vegetation is a very good reflector. (d) Look at your histogram again, and comment on the assumption of two classes (k=2). [3 marks] - there are minor peaks at z=4, 7, 10-11; maybe there are three other classes there - e.g. mud / wet sand, various land-covers / vegetations. 0 0 0 0 0 1 1 1 0 0 1 1 1 1 1 1 0 1 1 1 1 1 1 2 1 1 1 1 1 2 6 10 1 1 1 4 4 7 12 13 1 4 8 7 7 11 13 11 5 11 8 7 10 13 14 12 10 10 9 11 13 14 14 13 Figure 1-1 Infra-red Image. 1 2 3 3 2 2 2 2 2 3 2 2 2 3 2 1 3 2 3 3 3 2 3 4 3 3 4 4 3 3 5 7 4 6 5 3 3 5 7 7 7 5 3 4 6 7 7 7 6 3 4 6 7 9 9 9 4 3 5 7 8 10 10 10 Figure 1-2 Red 3 3 3 4 3 4 3 3 3 3 3 3 3 3 3 2 3 3 4 5 3 3 3 3 4 4 4 4 3 2 5 6 6 4 3 3 3 3 5 6 5 3 3 4 4 5 6 6 3 2 3 4 6 6 6 6 3 2 3 6 6 6 7 6 Figure 1-3 Green. 2. [20 marks] (a) Draw a scatter plot for the data set given in Figure 2-1; use graph-paper if you wish. [3 marks] ANS: 3 marks for properly annotated axes etc. x1 ^ | | 4 + 2 | 3 + 2 2(m2)2 | 2 + 1 2 | 1 1 1(m1)1 | 0 +----1----+----+----+----+--> 0 1 2 3 4 x0 (b) Explain how you would 'train' a nearest mean classifier on data set given in Figure 2-1, and do it. [4 marks] ANS: training is simply estimating the mean vectors: m1 = 1 m2 = 3 1 3 (c) How, normally, would you test such a classifier -- explain any pitfalls. [4 marks] ANS: you would gather more data - like the training data, but, presumably with different 'noise'; pitfalls: (1) testing on training data, (2) not ensuring test data from as wide a range of conditions as possible. marks 2 + 2. (d) Apply the nearest neighbour classifier to the points 0,2 2,1.5 4,4 [3 marks] ANS: class1 class2 x0 x1 x0 x1 mean 1 1 3 3 data 0 2 0 2 diff 1 1 3 1 sq.dist 1^2+1^2 =2 3^2 +1 = 10 **nearest => class 1 mean 1 1 3 3 data 2 1.5 2 1.5 diff 1 0.5 1 1.5 sq.dist 1+0.25=1.25 1+ 2.25 = 3.25 **nearest => class 1 mean 1 1 3 3 data 4 4 4 4 diff 3 3 3 3 sq.dist 9+9=18 1+1 = 2 **nearest => class 2 (e) Describe the application of a one layer neural network to the classification task (again training data = Figure 2-1) . [Hint: w00 = -1, w01 = -1, w02 = 4] [6 marks] ANS: x0 \ \w00 \ \ +--+--+ output x1 w01 | | y ------------+ +----------> | | /+--+--+ / /w02 / x2 (+1 always - BIAS) - diagram like above + description of: (a) 2 s = sum w0i.xi i=0 2 marks (b) if s > 0 output y = 1 - indicates class 1 otherwise y = 0 - indicates NOT class 1 i.e. cl.2 1 mark (c) Generally, you would have an (output) neuron for each class (though in a two class case one will do), an that class's neuron will 'fire' at 1 for that class. In our example, the neuron for class 2 would be w10 = 1, w11 = 1, w12 = -4 1 mark (d) example application, e.g. 2 marks x = (1,1) 1.(-1) + 1.(-1) + 4 = 2 > 0 -> 1 => class 1 x = (3,3) 3.(-1) + 3.(-1) + 4 = -2 <=0 -> 0 => class 2 x = (2,2) 2.(-1) + 2(-1) + 4 = 0 <=0 -> 0 => class 2 but this one is on the 'borderline' - mention of linear boundary, or similar insightful comment up to 2 x 1 marks - but MAX. 6 label x0 x1 1 : 0.00 1.00 1 : 1.00 0.00 1 : 1.00 1.00 1 : 1.00 2.00 1 : 2.00 1.00 2 : 3.00 2.00 2 : 2.00 3.00 2 : 3.00 3.00 2 : 4.00 3.00 2 : 3.00 4.00 end. University of Ulster, Magee College. Faculty of Informatics. BSc. Applied Computing (1220), Year 4. Course: Image Processing (AC460). Autumn Term 1994. Lecturer: J.G. Campbell. Date: / /94 Class Test No. 2 20 % of C/W Marks. 19/12/1994 10.00am-11.00am ----------------------------------- A N S W E R S and M A R K I N G ---------------------------------- ********************************** 1. Image data compression. (i) Explain the terms: ENTROPY or INFORMATION, REDUNDANCY or INDEPENDENCE, in the context of image data compression. [8 marks] ANS: - data compression exploits the fact that you need only store or transmit 'real' 'information', - '(real) information' is strongly related to uncertainty; there is no net flow of information where, previously, no uncertainty existed; you give no 'information' if you tell someone something that he knows already, - formally, the 'size' of a piece of 'information' can be measured in terms of the number of binary-digits (BITS) required to represent it unambiguously, - the number of bits is exactly the number of 'twenty-questions' type questions that need to be asked to unambiguously determine the information, - the measure is zero if there is no uncertainty, and there is no point is sending or storing anything, - the measure is small if there is small uncertainty (small variation), eg. a hexadecimal digit can be expressed in 4 bits, - the measure is large if there is big uncertainty, a big number requires a large number of bits, eg. a list of the populations of Irish towns and cities would range (say) 100 to 1,000,000 (a large variation) and would require about 20 bits for each number, - in sequences of pieces of information, sometimes one piece of information 'gives the game away' as to another piece of information; eg. given the occurrence of the letter 'q', surely 'u' follows; this is variously called 'redundancy' (the 'u' is redundant), 'correlation', and 'dependence', - in signals and images, there usually is dependence (correlation) between neighbouring points, - the bigger the dimensionality (signals: dimensionality = 1, images = 2, moving sequences of images = 3) the more neighbours, so the more correlation, and so the greater the scope for correlation, - various compression methods exploit correlation in markedly different ways: - run-length encoding, - differential (predictive) encoding, - transform coding. - explanation of the formula for entropy. - "independent" symbols. In a message composed of English text, the symbols are not independent: there is varying degrees of dependence between successive symbols (and indeed between groups of them). If the letter "q" has just been transmitted, then the probability of "u" surely increases and the information contained in the next letter decreases accordingly; very little information is contained in the "u" because everybody knew what it was before it arrived. English is more than 50 percent redundant. percent redundancy = (1- Hdep/Hindep).100 where Hdep= average entropy for dependent symbols. Hindep= average entropy for independent symbols. In images redundancy crops up as correlation between neighbouring pixels. Deep down, most data compression algorithms are based on removal of redundancy - don't send parts of the message that are already known - or on non-uniform entropy (and these are related). Transformation coding and predictive coding are examples of redundancy removal - but the redundancy removal is implicit and not too obvious. Marks: parrot recall: 2 x 3 marks insight + 2 x 1 mark (b) (i) Compute the probability density function of the 10 x 10 image given in Figure 6-1 below. Hence, derive the average entropy per symbol / pixel. Derive a Huffman code for this image and compute the average length of the Huffman codewords (averaged over the image) [If you wish, refer to the Huffman algorithm given in the Appendix]. Assuming that the image is originally coded with 3 bit pixels, and, neglecting the space occupied by the code table, derive the compression performance of the Huffman code. Hence compare this compression performance with that of the theoretically optimum source code. Note: log2(0.25) = -2.0, log2(0.2) = -2.32, log2(0.15) = -2.74 (b) (ii) Derive a Huffman code, or some other compression scheme of your choice, for the image given in Figure 6-2 below, and, hence, derive the savings compared to 'plain' 8-bit coding. [Hint: if using Huffman, carefully compare this image with the one in Figure 6-1]. 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 Figure 6-1 ---------- 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 122 122 122 122 122 122 122 122 122 122 122 122 122 122 122 122 122 122 122 122 122 122 122 122 122 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 Figure 6-2 ---------- [12 marks] ANS: - Compute the histogram - hence, compute the probability density z 1 2 3 4 5 h[z] 25 25 20 15 15 p[z] .25 .25 .2 .15 .15 1 mark - hence, derive the average entropy per symbol / pixel; H = - # pi.log2(pi) = -[ -2 x 0.25 x 2 - 0.2 x 2.32 - 2 x 0.15 x 2.74 ] = -[ -1 -0.464 -.822 ] = 2.29 bits per pixel 2 marks Note: log2(0.25) = -2.0, log2(0.2) = -2.32, log2(0.15) = -2.74 - derive a Huffman code for this image; z 1 2 3 4 5 h[z] 25 25 20 15 15 p[z] .25 .25 .2 .15 .15 --------- .30 .25 .25 .2 -------- .45 .3 .25 ------ .55 .45 ------- 1 2 marks Hence codewords: z 1 2 3 4 5 h[z] 25 25 20 15 15 p[z] .25 .25 .2 .15 .15 codeword 01 10 11 000 001 length 2 2 2 3 3 2 marks - compute the average length of the Huffman code (averaged over all pixels in the image); p[z] .25 .25 .2 .15 .15 codeword 01 10 11 000 001 length 2 2 2 3 3 Lav = #pi.Li = 2 x .25 x 2 .2 x 2 + 2 x .15 x 3 = 1 + .4 .9 = 2.3 1 mark - compare with 3 bits, - compare with the theoretical optimum, neglecting the space occupied by the code table, theoretical optimum = entropy = 2.29 bits so Huffman code near optimum (in fact, only reason not optimum is quantisation effects) 1 mark (ii) (b) Derive a Huffman code, or some other compression scheme of your choice, for the image given in Figure 6-2 below, and, hence, derive the savings compared to 'plain' 8-bit coding. [Hint: if using Huffman, carefully compare this image with the one in Figure 6-1]. 4 marks ANS: The Huffman coding is esssentially the same: note that image 6-2 has the same distribution of symbols (substitute 1 -> 120, 2 -> 122, 3 -> 25, 4 -> 255, 5 -> 9) so the symbols are the same, and average code length = 2.3 compression = 2.3/8 = 28.75 % Max. marks 12 (allow compensation, especially for insightful comment) 2. (a) Explain how a Discrete Fourier Transform may be used to do frequency analysis of a sampled waveform. [6 marks] The Discrete Fourier Transform (DFT) of the digital signal / sequence x[n] , i=0,1...N-1, is N-1 (3.6-1) X[u] = (1/N) # x[n] exp(-j.2PI.u.n/N) n=0 u=0,1...N-1 Interpretation of DFT: - decomposition of x[n] into a weighted sum of the fundamental sine and cosine signals If x[n] has been produced by sampling at fs = 1/T, the largest frequency present in x[n] is fs/2 (section 3.3.3). Examine eqn. 3.6-1, and look at just the cos part, N-1 (3.6-1) X[u] = (1/N) # x[n] exp(-j.2PI.u.n/N) n=0 cos part: N-1 (3.6-3) Xc[u] = (1/N) # x[n] cos(2PI.u.n/N) n=0 at u=0 : cos() = cos(2PI.(0).n/N) ------- = cos(0) = 1 i.e. we are matching x[n] with a constant (sometimes called the DC component - DC = direct current) signal. at u=1 : cos() = cos(2PI.1.n/N -------- this term varies slowly from cos(0) at n=0 to cos(2PI.(N-1)/N)) = cos(2PI) i.e. just one cycle in the sequence. 3.6.2 Discrete Fourier Spectrum. ------------------------------- As with the continuous Fourier Transform, X(w), X[u] is often also called the SPECTRUM of x[n] - or the DISCRETE spectrum. - this is because the spectrum is worried only about the frequency, not whether it is cosine or sine If you take the modulus of the complex numbers, X[u]=Xc[u]+j Xs[u]: (3.6-4) Xa(w) = |X[u]| = |Xc[u]+jXs[u]| = sqrt(Xc*Xc + Xs*Xs) Xa[u] is called the AMPLITUDE spectrum; Xa*Xa is called the POWER spectrum. 3.6.3 Interpretation. -------------------- The X[u] components near the beginning of the DFT (u=0,1,2,..) correspond to the 'slowly varying' parts of the signal; the components near the middle, the 'fast moving' parts. (b) Consider a CD signal sequence; each channel is sampled at fs=44.1 KHz. Take a one second sequence, ie. 44,100 samples. Show how you would provide a frequency analysis of this sequence and provide output corresponding to 10 bands: B1: 0-2000 Hz, B2: 2- 4Khz, 4-6Khz, ..., B10: 18-20KHz. [7 marks] (1) Do DFT (2) Use eqn. 3.6-4: Xa(u) = |X[u]| = |Xc[u]+jXs[u]| = sqrt(Xc*Xc + Xs*Xs) (3) Take the average of Xa[u] over each band: uhigh Xaave = sum Xa[u] /(uhigh-ulow+1) u=ulow 3.6.4 Frequency Discrimination by the DFT. ------------------------------------------ In general, if we have N samples, the spacing of each frequency sample is fs/N. (3.6-6) df = fs/N The larger the N, the finer is our discrimination of frequency. If N -> infinity, then we have a continuous spectrum, i.e. the continuous Fourier transform. Above N/2 we have no additional information; these terms correspond to the 'negative' frequencies of the continuous transform. (Example 3.6-1.) Consider a CD signal sequence; it (one channel) is sampled at fs=44.1 KHz. Take a one second sequence, i.e. 44,100 samples. Take a DFT of x[n], n=0,1,...44099. We have df = 44100/44100 = 1 Hz, i.e. a sampling of 1 Hz of the spectrum. The highest frequency is N/2; this is fmax = 44100/2 = 22050 i.e. the highest frequency represented (fs/2). (cf. Example 3.6-2). B1: 0 to 2 Khz, so we are interested in the first 2001 frequency samples: X[u], u=0..2001; For every u, you will have X[u] = Xc[u] + j.Xs[u] Now, for every u = 0..2000, apply eqn. 3.6-4: (3.6-4) Xa(w) = |X[u]| = |Xc[u]+jXs[u]| = sqrt(Xc*Xc + Xs*Xs) Xa[u] is called the AMPLITUDE spectrum; - and take the average as given above. (c) You take the DFT of a 8 sample sequence, and get the following: u 0 1 2 3 4 5 6 7 Xc[u] 1 0 0 0 0 0 0 0 Xs[u] 0 0 0 0 0 0 0 0 Answer with explanation: what does the sequence x[n], n=0..7 look like? [3 marks] (d) You take the DFT of a 8 sample sequence, and get the following: u 0 1 2 3 4 5 6 7 Xc[u] 0 1 0 0 0 0 0 1 Xs[u] 0 0 0 0 0 0 0 0 Answer with explanation: what does the sequence x[n], n=0..7 look like? [4 marks] end. University of Ulster, Magee College. BSc. Applied Computing (1220), Year 4. Course: Image Processing (AC460). Autumn Term 1995. Lecturer: J.G. Campbell. Date: / /95 FT. or give t.s. + complex FT, and as k for explanation; then ask for amplitude spectrum; ask about phase as question with bite. Figures (a) and (b) show an image before and after median filtering; give a brief explanation of median filtering and and point out significant ... [10 marks] ------------------------------------------------------------------ University of Ulster, Magee College. BSc. Applied Computing (1220), Year 4. Course: Image Processing (AC460). Autumn Term 1995. Lecturer: J.G. Campbell. Date: / /95 ASSIGNMENT 1 -- Some Clues. 1. In my solution, I found it easier to introduce a 'user3' DataLab function as follows. Really it is the same as 'user2', EXCEPT, it has TWO destinations (need these for DFT) and ONE source. Thus, in 'dl.c': "user3", 203, 1, 2, 3, 0, case 203: closegraph(); ret=user3(&im[dd[0]],&im[dd[1]],&im[ss[0]],args); CMbegin(&w,&werr,&pp); break; In addition, if you are going to use 'ipuser31.dl' - see item 8. below, you should change the #define symbolic constant 'NIM' from 10 to 12. 2. Below are skeletons of my solution. 3. Note: the easy way (in one pass) to calculate standard deviation is as follows, assume data are xi, i= 1..N. sum = 0.0; sumsq = 0.0; for i = 1..N do sum = sum + xi; sumsq = sumsq + xi*xi; endfor mean = sum/N; meansq = sumsq/N; /*average of sum of squares*/ variance = meansq - mean*mean; stddev = sqrt(variance); 4. In 'oorf', I have used a threshold based on the standard deviation of the data in the window; just as easily, you could enter the threshold as an argument -- see 'enms' just after 'oorf'. 5. In the DFT and FFT parts, it is easier to use function 'time'; unfortunately it gives a resolution of only seconds. Hence, for the 'fft' I have run it 100 times to get a sensible value. In this case, din't forget to divide results by 100. 6. In using 'rfdft' you will probably find that 4096 is the maximum you can get done -- I suspect that 8192 would take several hours; as it is 4096 will take about five minutes. /*---------------------------------------------------- oorf j.g.c. 05/11/95 out-of-range filter, see JC Image Processing Notes, Chapter 4.14.4 ----------------------------------------------------*/ static int oorf(IMD *g,IMD *f,Lint d, Lint rl,Lint rh,Lint cl,Lint ch, Lint wl,Lint wh,Lint vl,Lint vh) { float val,valf,valg,sum,sumsq,thr,var,fn; Lint r,c,k,l,n; n=(wh-wl+1)*(vh-vl+1); fn=n; if(n==0)return 1; for(r=rl;r<=rh;r++){ for(c=cl;c<=ch;c++){ sum=0.0;sumsq=0.0; .... use 'medflt' as a template for the ramainder' return 0; } /*---------------------------------------------------- enms j.g.c. 06/11/95 contrast enhancement based on mean and standard deviation, see JC Image Processing Notes, Chapter 4.7 zout=(zin-meanold).stddevnew/stddevold + meannew ----------------------------------------------------*/ static int enms(IMD *g,IMD *f,Lint d, Lint rl,Lint rh,Lint cl,Lint ch, float mm, float ss) { float val,valf,valg,sum,sumsq,var; Lint r,c,n; n=(rh-rl+1)*(ch-cl+1); if(n<=0)return 1; /*-- first pass: compute mean and std. dev --*/ sum=0.0;sumsq=0.0; ... fill in sumsq/=(float)n; sum/=(float)n; /*averages*/ /*-- variance(x) = average of (x^2) - [average of x]^2 --*/ var=sumsq-sum*sum; var=sqrt(var); /*standard deviation*/ printf("\nOld mean = %f, New = %f\n",sum,mm); printf("Old std. dev. = %f, New = %f\n",var,ss); /*-- second pass: apply transformation --*/ ... fill in return 0; } [I found no errors in this] static void rfdft( float x[], /*- real input -*/ float xro[], float xio[],/*- real and imag. outputs -*/ int N) /*- length -*/ { float pi = 3.1415926,tpi; int n, u; ... etc. /*---------------------------------------------------- rdft j.g.c. 06/11/95 straight computation DFT (real) ----------------------------------------------------*/ static int rdft(IMD *g0,IMD *g1,IMD *f,Lint d, Lint rl,Lint rh,Lint cl,Lint ch) { float valf,valg,*x,*xrout,*xiout; Lint r,c,n,j; time_t t1,t2; /*actually these are unsigned long*/ n=(ch-cl+1); if(n<=0)return 1; x=rowvec(REAL,0,n-1); if(x==NULL)return 2; /*could not allocate*/ xrout=rowvec(REAL,0,n-1); if(xrout==NULL)return 3; /*could not allocate*/ xiout=rowvec(REAL,0,n-1); if(xiout==NULL)return 4; /*could not allocate*/ for(r=rl;r<=rh;r++){ for(c=cl,j=0;c<=ch;c++,j++){ IMDget(&valf,d,r,c,f); x[j]=valf; } t1=time(NULL); rfdft( ????? ); t2= ?????????; printf("\nTime for dft = %lu\n",t2-t1); for(c=cl,j=0;c<=ch;c++,j++){ IMDput(?????????????, g0); IMDput(?????????????, g1); } } rowvecdel(xiout,REAL,0,n-1); rowvecdel(xrout,REAL,0,n-1); rowvecdel(x,REAL,0,n-1); return 0; } Actually, this is not essential, but it is easier than muccing around with the FFT functions in ff.c. /*---------------------------------------------------- rfft j.g.c. 06/11/95 FFT (real)- for comparison with above DFT ----------------------------------------------------*/ static int rfft(IMD *g0,IMD *g1,IMD *f,Lint d, Lint rl,Lint rh,Lint cl,Lint ch) { float val,*dt,im,re,fn; Lint r,c,n,j,i; time_t t1,t2; n=(ch-cl+1); if(n<=0)return 1; fn=n; printf("rfft, n= %ld\n",n); if(log2((int)n)==-1)return 1; if(log2((int)n)<2)return 2; if((dt=rowvec(REAL,1,2L*(n)))==NULL)return 10; /*2 for complex*/ for(r=rl;r<=rh;r++){ t1=time(NULL); for(i=0;i<100;i++){ for(j=1,c=cl;c<=ch;c++,j+=2){ IMDget(&val,d,r,c,f); dt[j]=val; dt[j+1]=0.0; /*zero complex part*/ } /* now do fft */ four1(dt,n,-1);/* see Num. Rec. p 411 NB. we use -1 for forward*/ } t2=time(NULL); printf("\nTime for 100 x fft = %lu\n",t2-t1); /* now extract re and cmplx parts*/ for(j=1,c=cl;c<=ch;c++,j+=2){ re=dt[j]/fn; im=dt[j+1]/fn; IMDput(&re,d,r,c,g0); IMDput(&im,d,r,c,g1); } } rowvecdel(dt,REAL,1,2*n); return 0; } /*------------------------------------------ user3 - user specified functions. BSc4 assignment j.g.c. 05/11/95. --------------------------------------------*/ int user3(IM *imd0,IM *imd1,IM *ims,float *args) { float val,val0,val1; Lint dl,rl,cl,dh,rh,ch,d,r,c; /*Lint is equivalent to long*/ IMD imd0d,imsd,imd1d; /*IMage Data records*/ int error,choice; CUB bs,bd1,bd0; /*CUBoids*/ choice=args[0]; /*-- get the bounds of each image --*/ bs=IMbound(ims);bd1=IMbound(imd1);bd0=IMbound(imd0); /*-- are images equal size (bounds)? --*/ if(!CUBsequal(&bs,&bd0))return 1; if(!CUBsequal(&bs,&bd1))return 2; /*-- extract image bounds from CUBoid: dl: lower depth bound, dh: upper bound rl: lower bound of rows rh: upper bound '' cl: lower bound of cols ch: upper bound */ CUBext(&dl,&rl,&cl,&dh,&rh,&ch,bs); /*-- open destination IMage Data blocks for write --*/ imd0d=IMopend(imd0,"w",&error); if(error!=0)return 3; /*cannot open data block*/ imd1d=IMopend(imd1,"w",&error); if(error!=0)return 4; /*cannot open data block*/ /*-- open source IMage Data block for read --*/ imsd=IMopend(ims,"r",&error); if(error!=0)return 5; /*cannot open data block*/ d=dl; printf("user3, choice=%d\n",choice); if(choice==1){ if(oorf(????????????????,d,rl,rh,cl,ch,-1,1,-1,1)!=0)return 20; } else if(choice==2){ if(enms(????????????????????,args[1],args[2])!=0)return 30; } else if(choice==3){ if(rdft ?????????????????????)!=0)return 40; } else if(choice==4){ if(rfft ?????????????? return 50; } else return 10; return 0; /*must return 0 for normal return*/ } 7. Here is a skeleton for ipuser3.dl -- use this for tasks 1, 2 and 3 and 5. /*ipuser3.dl /*j.g.c.05/11/95 config 10 4,0,15,15 0,0,4,0 4,0,15,15 0,0,4,0 4,0,15,15 0,0,4,0 4,0,4,4 0,0,4,0 4,0,4,4 0,0,4,0 4,0,4,2 0,0,4,0 4,0,0,2 0,0,4,0 4,0,0,7 0,0,4,0 4,0,0,7 0,0,4,0 4,0,0,7 0,0,4,0 ina 0 ../matdat/seg1616.mat /*not ideal -- use your own ina 1 ../matdat/e1616.mat /*you make this up ina 3 ../matdat/ms55.mat /*you make up gcos 7 8 ina 5 ../matdat/e53.mat /* you make up dlbend I have left out the processing. 8. Here is ipuser31.dl, which I used to do task 4 -- but, of course, I have left out the processing. /*ipuser31.dl /*j.g.c.06/11/95 config 12 4,0,0,511 0,0,4,0 4,0,0,511 0,0,4,0 4,0,0,511 0,0,4,0 4,0,0,1023 0,0,4,0 4,0,0,1023 0,0,4,0 4,0,0,1023 0,0,4,0 4,0,0,2047 0,0,4,0 4,0,0,2047 0,0,4,0 4,0,0,2047 0,0,4,0 4,0,0,4095 0,0,4,0 4,0,0,4095 0,0,4,0 4,0,0,4095 0,0,4,0 dlbend end. ------------------------------------------------------------- University of Ulster, Magee College. BSc. Applied Computing (1220), Year 4. Course: Image Processing (AC460). Autumn Term 1994. Lecturer: J.G. Campbell. Date: / /94 ASSIGNMENT 1 ---ANSWERS & MARKING ----------- Submission Date: Thurs. 13th October 1994. 5.00pm Marks: 10% of coursework. Title: Introductory Exercises in Image Processing All answers must be typed and presented neatly bound or stapled - with a proper title page. I will reserve a proportion of the marks for presentation. I reserve the right to change the marking scheme. 1. The image synthesis systems used by the movie industry use 4096 x 4096 pixel images. [10 marks] (a) assuming 25 frames (images) per second, how much data for a one hour film ? Assume 1 byte per pixel x 3 colours. No. of pixels per image = 4,096 x 4096 = 16 Megabytes x 3 = 48 Mbyte = 50 Mbytes (say) images per sec = 25 pixels per sec = 1250 Mbyte (approx.) per hr = 1250 x 3600 = 172,800 Mbyte = 4,500,000 Mbytes which is an awful lot! (b) High-Definition TeleVision (HDTV) is now being talked about. Why is it difficult to envisage transmission of 4096 x 4096 pixel images ? Assuming digital transmission 1250 Mbytes per sec. implies 10 Giga bits per sec; this would require, at least, 10 GigaHertz bandwidth - which is pretty imposible. Now, assume analogue. Assume that current TV is 512 x 512 pixels per sec; this requires a bandwidth of about 6 MegaHertz The 'data' increase is (4096/512)^2 = 8^2 = 64 Therefore, arguing purely qualitatively, we get a bandwidth requirement of about 384 MegaHertz. If we were to transmit on the UHF band, one channel would use most of the whole band. Actually, HDTV will work at aboy 1200 x 1000 pixels; even still they will have to do a lot of data compression. See chapter 6. 2. Music on CD is digital. CD sampling rate is 44,100 samples per second, amplitude resolution is 16-bits per sample for each of 2 channels (stereo). In modern telephone systems, speech is transferred digitally between major exchanges - here you can get away with 8,000 samples per second, 8-bits per sample, and one channel. [10 marks] (a) Why the difference ? ANS: the difference is that most of the 'information' in speech (mostly - except opera singers!) is in 0 to 4 KHz - thus 8KHz sampling (sampling freq. = 2 x max. freq. present). Also, with 'companding' - a form of data compression, we can reduce to 8 bits amplitude; Note: any audible sound can be coded in about 12 bits, but it is usually convenient to use 16 (2 bytes). (b) What is the saving - over (say) a one hour period. ANS: CD data for 1 sec = 2 x 44,100 x 2 bytes = 176,400 bytes = 176,400 x 3600 bytes per hr = 635 Mbytes per hr. Telephone data 1 sec = 1 x 8,000 x 1 bytes per sec = 8,000 x 3600 bytes per hr = 28.8 Mbytes per hr. 3. (Ex. 1.6-8.) The ground resolution of weather satellites is about 1 to 5 Kilometers. The ground resolution for some spy satellites is reported to be 15 centimeters. [15 marks] (a) A satellite image has a ground resolution of 15 cm. What does a ground resolution of 15 cm mean? Ans. - see notes. Note, if answering this in an exam. USE A DIAGRAM. (b) Using such an image, could a skilled interpreter:- (i) read the number plate of a car? ANS: Assume a letter size of 8 cm x 6 cm, AND, that the width of the 'stroke' in the letters is 1 cm, my usual rule-of-thumb is 2 to 4 pixels per minimum feature size;therefore no chance. I'll set up a DataLab demo. of some letters so you can see. Anyway, when did you last see a number plate on the roof ! Note, in an exam. USE A DIAGRAM. (ii) determine the type of a car? ANS: Assuming that you could resolve the bonnet, and boot, and roof, yes, you could distinguish between saloons, estates and smaller and larger, but not exact makes and models. In a military surveillance situation, a skilled interpreter could probably count tanks, lorries, smaller vehicles, etc. (iii) count the number of cars in a car park? ANS: probably - assuming sufficient contrast between car 'colour' and ground' colour. (iv) evaluate how many people at a football match? ANS: Very unlikely; even though you might have 4 - 6 pixels per person, it would be very hard to separate them. (v) tell where the football was at the time the image was taken? ANS: very unlikely - ball = 1 pixel, see above. 4. (based on Ex. 1.6-10.) You are designing an image processing system to do real-time machine inspection for quality control of denim (jean) fabric. [25 marks] (a) Do some research and suggest a suitable sampling resolution (in pixels per millimeter). ANS: The file used in the DataLab demo 'iptex' (d11.dat) used a sampling of 6 pixels per mm. I reckon we could get away with less, say 3 or 4; but I'll say 10 for easy, order-of-magnitude, calculation, i.e. 100 pixels per square mm. (b) The fabric is produced in 2 metre wide rolls, and it passes the inspection point at 1 metre per second). How many pixels will be generated per second ? ANS: area per sec = 2 metres sq. = 2 x 10^6 mm sq. pixels per sec = 200 x 10^6 (c) Assuming very simple processing - take the 3 x 3 smoothing / averaging mentioned in chapter 1, discuss the feasibility of doing the processing using a commercial microprocessor chip - consider speed performance only. ANS: Assuming a static image, the program will have a set of nested loops something like the following (see Chapter 4.11): for(r=1..NROWS) for(c=1..NCOLS) sum=0.0; /*one per pix */ for(rr=1..3) for(cc=1..3) get pixel (val) /* 9 of these for each pix*/ sum=sum+val; /* 9 of */ valout = sum/9; /*one per pixel*/ put pixel (valout) /*one of these for each pixel */ Therefore, as a MINIMUM - neglecting control variables, we have: 10 memory accesses per pixel; 9 adds 1 divide. Assume p microsec per memory access, 5.p per add, 25.p per divide. Total machine operations per pixel = 10 m.a. + 9 add + 1 div. Total time in microsec. = 10.p + 9.5.p + 25.p = (10 + 45 + 25).p = 75.p per pix There are 200 x 10^6 pixels per sec., Therefore processing per sec. = 200 x 10^6 x 75 x p x 10^-6 = 200 x 75 x p. Therefor to get one processor to do it, 1 = 200 x 75 x p p = 1/(200x75) or 15000 memory accesses per microsec. Which is fairly fast work. Really, all I'm trying to prove is that it is pretty futile to try this sort of processing on a, single processor, serial computer. We (the Signal and Image Processing Group at Magee) are working on parallel neural network solutions for problems of this sort. (d) Estimate the economic viability of such a system. Assume that the company works 3 x 8 hour shifts for a 6 day week, 45 weeks per year. For each shift they use 4 inspectors, each costing 15,000 pounds per year; only two operate at any time, but assume that the other two are doing nothing that is economically significant. Assume that the inspection machinery costs 500,000 pounds, has a lifetime of three years, with an annual maintenance of 50,000 pounds. Assume that human and machine detection performances are the same. ANS: Even assuming no sickness or other absence, we need 12 inspectors. Cost œ15,000 p.a. Total personnel cost = œ180,000 p.a. Machinery accountancy cost p.a. = 500,000/3 = œ167,000 + maintenance = œ 50,000 Total p.a. = œ217,000 I.e. people œ37,000 p.a. cheaper. (e) Mention any additional factors that you consider relevant. Of course, the real problem is in the assumption about detection performance; we would hope that the machine would perform better - and it is difficult to estimate the cost of missed faults: claims from customers, loss of customer loyalty, etc. Also, the cost per person is very low; usually the cost of a person = 3 x salary. end. University of Ulster, Magee College. BSc. Applied Computing (1220), Year 4. Course: Image Processing (AC460). Autumn Term 1994. Lecturer: J.G. Campbell. Date: / /94 ASSIGNMENT 2 ------------ Submission Date: Wed. 26th October 1994. 9.15am - class. Marks: 10% of coursework. Title: Image Contrast Enhancement All answers must be typed and presented neatly bound or stapled - with a proper title page. I will reserve a proportion of the marks for presentation. I reserve the right to change the marking scheme. Task: Compare and contrast the image enhancement techniques: linear contrast stretching, histogram equalisation. In your answer, be careful to mention the type of image, and applications, to which these techniques are particularly suitable. [8 marks] (b) An 8 x 8 3-bit (8-level) image is shown in Figure 1. Compute (preferably using the algorithm given in Chapter 4.8) the grey level transformation, T(z), required for histogram equalisation. Apply T(z) to obtain the enhanced output image. Comment on all your results. [12 marks] 0 1 2 3 5 4 4 3 6 1 2 3 5 4 4 3 6 1 2 3 3 4 4 5 6 3 2 3 5 4 4 3 2 4 2 3 3 4 4 5 2 5 2 3 3 4 4 5 2 5 2 3 3 4 4 5 7 5 2 3 3 5 4 4 Figure 1 --------- (c) What is the probability density function for the image; N.B. you really don't need this for part (b). [2 marks] end. University of Ulster, Magee College. BSc. Applied Computing (1220), Year 4. Course: Image Processing (AC460). Autumn Term 1994. Lecturer: J.G. Campbell. Date: / /94 ASSIGNMENT 2 - answers + marking. ------------ Submission Date: Wed. 26th October 1994. 9.15am - class. Marks: 10% of coursework. Title: Image Contrast Enhancement All answers must be typed and presented neatly bound or stapled - with a proper title page. I will reserve a proportion of the marks for presentation. I reserve the right to change the marking scheme. Task: Compare and contrast the image enhancement techniques: linear contrast stretching, histogram equalisation. In your answer, be careful to mention the type of image, and applications, to which these techniques are particularly suitable. [8 marks] ***ANS: (1) Linear contrast stretch. ---------------------------- Any LINEAR function, from z to z', can be expressed as z' = T(z) = c1.z + c0 i.e. the functional relationship between z' and z is a straight LINE (thus LINEAR). Thus, Figure 1 output z' ----------- may go non-linear here | / | / | / z'1+ / slope of this line is c1 | / | / c0 is where it meets z' axis c0|/ linear here | 0 |______|____________________ 0 z1 input z For any input pixel value z1, find the point on the horizontal (z1) axis, then move vertically to the curve, then horizontally to get z'1 = t(z1). Figure 1 Linear Contrast Stretch -------------------------------- The term c0 signifies a 'shift'. If c1 < 1 then you actually have a 'compress' instead of a stretch; if c1=1 then we have no stretch. If c1>1 then we have a 'stretch'. Hence, as described in the notes: " Suppose, a<=f1(x,y)<=b, ie. the range of f1() is [a,b], and you want the range stretched to z0<=z'<=G, then the transformation is (4.5-2) z'= (zG-z0).(z-a)/(b-a) + z0 = (zG-z0).z/(b-a) + (z0.b-zG.a)/(b-a) scale shift (2) Histogram Transformation /Equalisation ------------------------------------------ Related to the question, the two main points about histogram equalisation (or any histogram modification) are that: (a) we still have a function z' = T(z) (implemented as a lookup function), - so in this respect is compares well with linear stretch, (b) but, the function need not be linear; in fact, we will stretch the parts where the histogram is large (to ensure that we spread the values out more evenly, at least in the average - it will arrange to compensate a big peak, with a big 'hole'; corespondingly, we will compress, where the histogram is small (to ensure that we 'clump the values up' an so have bigger (output) histogram values. Hence, in general the function T(z) will be NON- LINEAR, i.e. simply, its slope will change with z. Both are POINT operations, i.e. no account is taken of the pixel's neighbours. Linear contrast stretch could be based on mean and std. dev. statistics, histog. eq. based on histog. statistics. I.E. both can be made relatively non-subjective. The linear cintrast stretch could be based on one pass of the data; however, if its is based on mean and std. dev. the two passes will be reqd. - i.e. the same as histog. equalisation. Hence, compare: both functions, both point operations, both can be used to increase contrast, both can be non-subjectively based on statistics, both may be two-pass. Contrast: one linear, other possibly non-linear; the histogram equalisation usually gives more subjectively pleasing results. Marks: description of each - enough to enable compare /contrast - preferably using examples and diagrams. 4 - 6 marks. Direct compare / contrast 2 - 4 marks. (b) An 8 x 8 3-bit (8-level) image is shown in Figure 1. Compute (preferably using the algorithm given in Chapter 4.8) the grey level transformation, T(z), required for histogram equalisation. Apply T(z) to obtain the enhanced output image. Comment on all your results. [12 marks] 0 1 2 3 5 4 4 3 6 1 2 3 5 4 4 3 6 1 2 3 3 4 4 5 6 3 2 3 5 4 4 3 2 4 2 3 3 4 4 5 2 5 2 3 3 4 4 5 2 5 2 3 3 4 4 5 7 5 2 3 3 5 4 4 Figure 1 --------- ANS: z 0 1 2 3 4 5 6 7 ----------------------------------------- h(z) 1 3 11 17 17 11 3 1 ------------------------------------------ q(z) 8 8 8 8 8 8 8 8 (1) Form P(z), and Q(w), from p() and q(), z 0 1 2 3 4 5 6 7 ----------------------------------------- P(z) 1 4 15 32 49 60 63 64 Q(w) 8 16 24 32 40 48 56 64 (2) for each z in the input do: (2.1) choose w such that Q(w) is closest to P(z), T(z) = w; (ie. set up lookup table/function) z=0 : must choose T(0) = 0 (closest) z=1 : T(1) = 0 z=2 : T(2) = 1 z=3 : T(3) = 3 z=4 : T(4) = 5 z=5 : T(5) = 6 z=6 : T(6) = 7 z=7 : T(7) = 7 Thus, resulting histogram: z 0 1 2 3 4 5 6 7 ----------------------------------------- h(z) 1 3 11 17 17 11 3 1 ------------------------------------------ q'(w) 4 11 0 17 0 17 11 4 marks 8 or pro-rata Comments: [4 marks] eg. - Notice that q'(w) is only an approximation to q(w), in general, for discrete levels, this will be the case. We could make the equalisation exact by modifying the algorithm to allocate the pixels at input grey level zi, not just to one output level wj, but put some in wj-1, and some in wj+1 - to produce an exact q(w) histogram. - plot of T(z) + comment on stretch at z = 0..2 and at z = 5..7 + compress in middle - comment on 'holes'/'hollows' to compensate for big peaks (17) (c) What is the probability density function for the image; N.B. you really don't need this for part (b). [2 marks] z 0 1 2 3 4 5 6 7 ----------------------------------------- h(z) 1 3 11 17 17 11 3 1 ------------------------------------------ p(z) 1/64 3/64 11/64 17/64 etc. 0.016 0.047 .172 .266 .266 .172 .047 .016 which, except for rounding errors, sum to 1.0. end. University of Ulster, Magee College. BSc. Applied Computing (1220), Year 4. Course: Image Processing (AC460). Autumn Term 1994. Lecturer: J.G. Campbell. Date: 28/10/94 ASSIGNMENT 3 ------------ Submission Date: Wed. 2nd November 1994. 9.15am - class. Marks: 15% of coursework. Title: Linear and Non-Linear Spatial Operations. All answers must be typed and presented neatly bound or stapled - with a proper title page. I will reserve a proportion of the marks for presentation. I reserve the right to change the marking scheme. Task: 1. (a) Compare and contrast the image enhancement techniques: linear spatial filtering, non-linear spatial filtering. In your answer, be careful to mention the type of image, and applications, to which each category of technique is particularly suitable. [8 marks] (b) Use the image in Figure 1 to illustrate the differences between a linear and a non-linear filter. 1 1 1 0 1 1 0 0 0 1 1 1 1 1 1 0 9 1 1 5 5 5 5 5 5 5 0 1 5 5 5 5 5 5 5 1 1 5 5 4 5 4 5 5 0 1 1 9 5 9 5 1 1 1 1 1 1 5 5 5 9 1 1 1 1 1 5 5 5 1 1 1 1 1 1 1 1 1 1 1 1 Figure 1 [12 marks] 2. Using your own example data, show an example of 'linear time (shift) invariant' filtering on a data sequence. [10 marks] end. University of Ulster, Magee College. BSc. Applied Computing (1220), Year 4. Course: Image Processing (AC460). Autumn Term 1994. Lecturer: J.G. Campbell. Date: 4/11/94 ASSIGNMENT 4 ------------ Submission Date: Wed. 16th November 1994. 9.15am - class. Marks: 25% of coursework. Title: Programming Convolution and Median Filtering. All answers must be typed and presented neatly bound or stapled - with a proper title page. I will reserve a proportion of the marks for presentation. I reserve the right to change the marking scheme. For this assignment, I will especially reward comments that show insight about the workings of the filters. Background. ----------- In DataLab, I have inserted slots for 'user' functions, i.e. so that users can fill in their own functions. I have provided templates, so that most of the hassle of programming DataLab is sidestepped. (a) Here are the entries in the DataLab function table: "user1", 201, 2, 1, 1, 0, "user2", 202, 2, 1, 1, 0, (b) here are the calls to the functions: case 201: ret=user1(&im[dd[0]],&im[ss[0]],&im[ss[1]],args); break; case 202: closegraph(); ret=user2(&im[dd[0]],&im[ss[0]],&im[ss[1]],args); CMbegin(&w,&werr,&pp); break; (c) the module 'user' (user.c, user.h) is shown in the Appendix. You will notice that 'user1' merely adds two images to produce a third (the 'argument') is ignored. 'user2' (selection 1) does convolution using eqn 3.8-7; this convolution das been encapsulated in function 'conv1'; selection 2 is meant to do convolution using the alternative form - eqn 3.8-6, but I've deleted the crucial part. Note that these convolutions do NOT use the 'centred' convolution scheme given in chapter 4 - in other words, the convolution mask has indices [0..rowmax-1], [0..cmax-1], instead of [-rowmax/2..+rowmax/2] ... (d) I have also created a 'dl' batch file 'ipuser.dl' that creates images, reads in images and exercises functions user1, and user2. (e) I have created a DOS batch file 'dlmake.bat' that makes a full set of DataLab directories on the C:\scratch directory - this will allow you to develop your own software. References: 1. Course Notes. 2. DataLab Users' Manual. 3. DataLab Programmers' Manual. Tasks: ------ 1. Write a section of code in 'conv2' to do convolution using eqn. 3.8-6. Note: there was a typo in the notes in eqn. 3.8-6 - it is corrected in the eqn. given in 'conv2'. Test to show that (a) conv2 does the same as conv1, (b) that it produces a correct result for 3 x 3 smoothing, (c) that it produces a correct result for Sobel edge enhancement, (d) that it produces a correct result for Prewitt edge enhancement. Note: I expect you to change data for each of the filters - NOT code. Required: 1.1 Software for conv2 - commented if neccessary. 1.2 Data printouts for each of tests (a), (b), (c), (d), (e). - input images f[], h[] - output image g[]. These data should be commented as neccessary. Note: DataLab function 'tps', prints 'images' on the screen, 'tpf' '' '' in a file, 'tpp' '' '' on a printer. 2. (a) Obtain results for 7 x 7 smoothing; (b) verify that the result is correct - correct choice of test data should ease this task. Required: - input images f[], h[] - output image g[]. These data should be commented as neccessary - especially to verify that the result is correct. 3. (a) Write a function 'med33' (based on conv1) that does 3 x 3 median filtering (b) verify that the result is correct - correct choice of test data should ease this task. Required: - software 'med33', - input image f[] - output image g[]. These data should be commented as neccessary - especially to verify that the result is correct. If you like, you can use 'choice=3' in user2 - so that you do not have to alter the main program of DataLab. 4. Obtain results to show that the 3 x 3 median filter is non- linear. 5. (a) Produce a function 'conv1d' that will do 1-dimensional convolution - I suggest adaptation of 'conv1' or 'conv2'. Again, you could use 'choice=4' in user2, so that you won't have to alter the main program, (b) obtain results to show that it is working correctly. Required: - software 'conv1d', - input signals x[], h[] (convolution kernel), - output signal y[]. These data should be commented as neccessary - especially to verify that the result is correct. Appendix. User Module. /*----------------------------------------- user.h - DataLab user specified functions j.g.c. 28/10/94 --------------------------------------------*/ #include "im.h" #ifndef USERH #define USERH /*------------------------------------------- user1 - user specified function. j.g.c. 28/10/94. --------------------------------------------*/ int user1(IM *imd,IM *ims0,IM *ims1,float *args); /*------------------------------------------ user2 - user specified function. j.g.c. 28/10/94. --------------------------------------------*/ int user2(IM *imd,IM *ims0,IM *ims1,float *args); /*----------------------------------------- user.c - DataLab user specified functions j.g.c. 28/10/94 --------------------------------------------*/ #include "user.h" static void add(IMD *imd,IMD *ims0,IMD *ims1,Lint d, Lint rl,Lint rh,Lint cl,Lint ch) { float val,val0,val1; Lint r,c; for(r=rl;r<=rh;r++){ for(c=cl;c<=ch;c++){ IMDget(&val0,d,r,c,ims0); IMDget(&val1,d,r,c,ims1); val=val0+val1; IMDput(&val,d,r,c,imd); } } return; } static void conv1(IMD *g,IMD *f,IMD *h,Lint d, Lint rl,Lint rh,Lint cl,Lint ch, Lint wl,Lint wh,Lint vl,Lint vh) { float val,valf,valh,sum; Lint r,c,k,l,kk,ll; char s[80]; /* printf("rl,rh,cl,ch: %ld,%ld,%ld,%ld\n",rl,rh,cl,ch); printf("wl,wh,vl,vh: %ld,%ld,%ld,%ld\n",wl,wh,vl,vh); fflush(stdin); getchar(); */ for(r=rl;r<=rh;r++){ for(c=cl;c<=ch;c++){ sum=0.0; /* below equivalent to eqn. 3.8-7, chapter 3 r c g[r,c] = # # f[k,l],h[r-k,c-l] k=r-N+1 l=c-M+1 ^ would be clearer if written as k=r-(N-1) to k = r - 0 Thus, if wl = 0, wh = N-1, we get the same. */ for(k=r-wh;k<=r-wl;k++){ for(l=c-vh;l<=c-vl;l++){ /*printf("r,c,k,l: %ld,%ld,%ld,%ld\n",r,c,k,l); */ IMDget(&valf,d,k,l,f); IMDget(&valh,d,r-k,c-l,h); sum+=valf*valh; } } /* fflush(stdin); getchar(); */ IMDput(&sum,d,r,c,g); } } return; } static void conv2(IMD *g,IMD *f,IMD *h,Lint d, Lint rl,Lint rh,Lint cl,Lint ch, Lint wl,Lint wh,Lint vl,Lint vh) { float val,valf,valh,sum; Lint r,c,k,l,kk,ll; char s[80]; for(r=rl;r<=rh;r++){ for(c=cl;c<=ch;c++){ sum=0.0; /* below equivalent to eqn. 3.8-6, chapter 3 wh c=vh g[r,c] = # # f[r-k,c-l],h[k,l] k=wl l=vl */ /* replace next line and fill in */ IMDget(&val,d,r,c,f); sum=val; IMDput(&sum,d,r,c,g); } } return; } /*------------------------------------------ user2 - user specified function. j.g.c. 28/10/94. --------------------------------------------*/ int user2(IM *imd,IM *ims0,IM *ims1,float *args) { float val,val0,val1; Lint dl,rl,cl,dh,rh,ch,d,r,c; /*Lint is equivalent to long*/ Lint dl1,rl1,cl1,dh1,rh1,ch1; IMD imdd,ims0d,ims1d; /*IMage Data records*/ int error,choice; CUB bs0,bs1,bd; /*CUBoids*/ choice=args[0]; choice=choice; /*-- get the bounds of each image --*/ bs0=IMbound(ims0);bs1=IMbound(ims1);bd=IMbound(imd); /*-- is dest. image equal size (bound) to source 0 ? --*/ if(!CUBsequal(&bd,&bs0))return 1; /*-- extract image bounds from CUBoid: dl: lower depth bound, dh: upper bound rl: lower bound of rows rh: upper bound '' cl: lower bound of cols ch: upper bound */ CUBext(&dl,&rl,&cl,&dh,&rh,&ch,bs0); CUBext(&dl1,&rl1,&cl1,&dh1,&rh1,&ch1,bs1); /*-- open destination IMage Data block for write --*/ imdd=IMopend(imd,"w",&error); if(error!=0)return 3; /*cannot open data block*/ /*-- open source IMade Data blocks for read --*/ ims0d=IMopend(ims0,"r",&error); if(error!=0)return 4; /*cannot open data block*/ ims1d=IMopend(ims1,"r",&error); if(error!=0)return 5; /*cannot open data block*/ d=dl; if(choice==1)conv1(&imdd,&ims0d,&ims1d,d,rl,rh,cl,ch,rl1,rh1,cl1, ch1); else if (choice==2)conv2(&imdd,&ims0d,&ims1d,d,rl,rh,cl,ch,rl1,rh1,cl1,ch 1); else return 10; return 0; /*must return 0 for normal return*/ } /*------------------------------------------ user1 - user specified function. j.g.c. 28/10/94. --------------------------------------------*/ int user1(IM *imd,IM *ims0,IM *ims1,float *args) { float val,val0,val1; Lint dl,rl,cl,dh,rh,ch,d,r,c; /*Lint is equivalent to long*/ IMD imdd,ims0d,ims1d; /*IMage Data records*/ int error; CUB bs0,bs1,bd; /*CUBoids*/ args[0]=args[0]; /*to stop warnings! */ /*-- get the bounds of each image --*/ bs0=IMbound(ims0);bs1=IMbound(ims1);bd=IMbound(imd); /*-- is dest. image equal size (bound) to source 0 ? --*/ if(!CUBsequal(&bd,&bs0))return 1; /*-- is dest. image equal size (bound) to source 1 ? --*/ if(!CUBsequal(&bd,&bs1))return 2; /*-- extract image bounds from CUBoid: dl: lower depth bound, dh: upper bound rl: lower bound of rows rh: upper bound '' cl: lower bound of cols ch: upper bound */ CUBext(&dl,&rl,&cl,&dh,&rh,&ch,bs0); /*-- open destination IMage Data block for write --*/ imdd=IMopend(imd,"w",&error); if(error!=0)return 3; /*cannot open data block*/ /*-- open source IMade Data blocks for read --*/ ims0d=IMopend(ims0,"r",&error); if(error!=0)return 4; /*cannot open data block*/ ims1d=IMopend(ims1,"r",&error); if(error!=0)return 5; /*cannot open data block*/ d=dl; add(&imdd,&ims0d,&ims1d,d,rl,rh,cl,ch); return 0; /*must return 0 for normal return*/ } #endif end. ## University of Ulster, Magee College. BSc. Applied Computing (1220), Year 4. Course: Image Processing (AC460). Autumn Term 1994. Lecturer: J.G. Campbell. Date: 25/11/94 ASSIGNMENT 4 ------------ A N S W E R S and M A R K I N G --------------------------------- Submission Date: Wed. 16th November 1994. 9.15am - class. Marks: 25% of coursework. Title: Programming Convolution and Median Filtering. PLEASE NOTE THIS: For this assignment, I will especially reward comments that show insight about the workings of the filters. - total marks 110. Tasks: ------ 1. Write a section of code in 'conv2' to do convolution using eqn. 3.8-6. Note: there was a typo in the notes in eqn. 3.8-6 - it is corrected in the eqn. given in 'conv2'. [15 marks] static void conv2(IMD *g,IMD *f,IMD *h,Lint d, Lint rl,Lint rh,Lint cl,Lint ch, Lint wl,Lint wh,Lint vl,Lint vh) { float val,valf,valh,sum; Lint r,c,k,l,kk,ll; char s[80]; for(r=rl;r<=rh;r++){ for(c=cl;c<=ch;c++){ sum=0.0; /* below equivalent to eqn. 3.8-6, chapter 3 wh c=vh g[r,c] = # # f[r-k,c-l],h[k,l] k=wl l=vl */ for(k=wl;k<=wh;k++){ for(l=vl;l<=vh;l++){ IMDget(&valf,d,r-k,c-l,f); IMDget(&valh,d,k,l,h); sum+=valf*valh; } } IMDput(&sum,d,r,c,g); } } return; } Test to show that (a) conv2 does the same as conv1, (b) that it produces a correct result for 3 x 3 smoothing, (c) that it produces a correct result for Sobel edge enhancement, (d) that it produces a correct result for Prewitt edge enhancement. Note: I expect you to change data for each of the filters - NOT code. (a) - (d) [4 x 5 marks] I gave 3 marks only to those who did not choose the test data so that the impulse response would be clearly visible(*) - or gave me something special. (*) what was needed was an isolated impulse somewhere in the image. 2. (a) Obtain results for 7 x 7 smoothing; (b) verify that the result is correct - correct choice of test data should ease this task. Required: - input images f[], h[] - output image g[]. These data should be commented as neccessary - especially to verify that the result is correct. [2 x 5 marks] - same comments as for 1(a) -(d). 3. (a) Write a function 'med33' (based on conv1) that does 3 x 3 median filtering (b) verify that the result is correct - correct choice of test data should ease this task. - 25 marks for program: I gave about 20 if I thought it worked but could be more 'understandable', I gave 23 to most people. I would have given an extra 2 to anyone who had 'dynamically' allocated the array space, like I did, see below: - 10 marks for test data; again full marks only for rigorous tests, commented properly; but, also, I think I was easy on people who gave fairly bland results. A few sample pixels, manually worked, showing the major characteristics, are helpful; also, comments about nibbling corners, etc. static int medflt(IMD *g,IMD *f,Lint d, Lint rl,Lint rh,Lint cl,Lint ch, Lint wl,Lint wh,Lint vl,Lint vh) { float val,valf,valh,sum,*array; Lint r,c,k,l,n,i; n=(wh-wl+1)*(vh-vl+1); if(n==0)return 1; array=rowvec(REAL,0,n-1); if(array==NULL)return 2; /*could not allocate*/ for(r=rl;r<=rh;r++){ for(c=cl;c<=ch;c++){ i=0; for(k=r-wh;k<=r-wl;k++){ for(l=c-vh;l<=c-vl;l++){ IMDget(&valf,d,k,l,f); array[i]=valf; i++; } } sort(array,n); sum=array[n/2]; /*should this be ( (n+1)/2 -1 ) ??*/ IMDput(&sum,d,r,c,g); } } rowvecdel(array,REAL,0,n-1); return 0; } 4. Obtain results to show that the 3 x 3 median filter is non- linear. - 10 marks; those who highlighted differences between the two images (med(f1+f2), and med(f1) + med(f2)) got most marks; subtraction helpful. 5. (a) Produce a function 'conv1d' that will do 1-dimensional convolution - I suggest adaptation of 'conv1' or 'conv2'. Again, you could use 'choice=4' in user2, so that you won't have to alter the main program, (b) obtain results to show that it is working correctly. - 15 marks for program, see below; fewer for discussion of applicability of two-d; fewer still for no program or discussion, - 5 marks for test data. static void conv1d(IMD *y,IMD *x,IMD *h,Lint d, Lint rl,Lint rh,Lint cl,Lint ch, Lint wl,Lint wh,Lint ml,Lint mh) { float val,valx,valh,sum; Lint r,c,m,k; char s[80]; wh=wh; /* to stop warnings */ for(r=rl;r<=rh;r++){ for(c=cl;c<=ch;c++){ sum=0.0; /* below equivalent to eqn. 3.8-1, chapter 3 m=mh y[c] = sum x[c-m].h[m] m=ml */ k=wl; /*it should be 0, but may not*/ for(m=ml;m<=mh;m++){ IMDget(&valx,d,r-k,c-m,x); IMDget(&valh,d,k,m,h); sum+=valx*valh; } IMDput(&sum,d,r,c,y); } } return; } end. University of Ulster, Magee College. BSc. Applied Computing (1220), Year 4. Course: Image Processing (AC460). Autumn Term 1995. Lecturer: J.G. Campbell. Date: / /95 ASSIGNMENT 1 -- ANSWERS and MARKING ----------------------------------- I allocated marks in equal proportion 20% to each question. See below allocation within a question. In many answers, I encountered little in the way of insight, or critical approach, or evidence of a bit of research in a book. Hence, not too many of the marks are over 70%. The marks are a bit bunched, but I cannot do much about that -- many of the answers were of a similar level (not similar, similar level). I will give more comments of this sort in class. Submission Date: Fri. 10th November 1995. 2.15pm - class. (deferred to Wed. 15th Nov. class). Marks: 50% of coursework. Title: Programming Image Processing Operations in DataLab. All answers must be typed and presented neatly bound or stapled - with a proper title page. I will reserve a proportion of the marks for presentation. I will especially reward comments that include insight and criticism. I reserve the right to change the marking scheme. Background. ----------- In DataLab, I have inserted slots for 'user' functions, i.e. so that users can fill in their own functions. I have provided templates, so that most of the hassle of programming DataLab is sidestepped. (a) Here are the entries in the DataLab function table: "user1", 201, 2, 1, 1, 0, "user2", 202, 2, 1, 1, 0, (b) here are the calls to the functions: case 201: ret=user1(&im[dd[0]],&im[ss[0]],&im[ss[1]],args); break; case 202: closegraph(); ret=user2(&im[dd[0]],&im[ss[0]],&im[ss[1]],args); CMbegin(&w,&werr,&pp); break; (c) the module 'user' (user.c, user.h) is available in U:\jc\dl\s You will notice that 'user1' merely adds two images to produce a third (the 'argument') is ignored. 'user2' (selection 1) does convolution using eqn 3.8-7; this convolution das been encapsulated in function 'conv1'; selection 2 is meant to do convolution using the alternative form - eqn 3.8-6. It does NOT use the 'centred' convolution scheme given in chapter 4 - in other words, the convolution mask has indices [0..rowmax-1], [0..cmax-1], instead of [-rowmax/2..+rowmax/2] ... (d) I have also created a 'dl' batch file 'ipuser.dl' that creates images, reads in images and exercises functions user1, and user2. (e) I have created a DOS batch file 'dlmake.bat' that makes a full set of DataLab directories on the C:\scratch directory - this will allow you to develop your own software. (f) Make sure to load 'project-file' 'dl.prj', and ensure that it has directories set appropriately. Note: in 'user.c' you will find the answers to last years equivalent assignment, i.e. median filtering, and various forms of convolution. I have decided to leave them there -- they may be of assistance. References: 1. Course Notes Chapters 3, 4, 9. 2. DataLab Users' Manual. 3. DataLab Programmers' Manual. 4. You may need to consult a text-book regarding 'template- matching'. Tasks: ------ 1. (a) Implement a function 'oorf' -- analogous to 'conv2' -- that performs out-of-range filtering as described in Chapter 4.14.4. (b) Test it with appropriate data (read from file). (c) Compare a 3 x 3 smoothing filter to a 3 x 3 'oorf' filter. Marks: (a) 5, b 5, c 5, comments 5. Required: 1.1 Software for 'oorf' - commented as neccessary. 1.2 Data printouts for each of tests of 'oorf' and comparison with 3 x 3 smoothing - input images f[], - output images g[]. These data should be commented as neccessary. Note: DataLab function 'tps', prints 'images' on the screen, 'tpf' '' '' in a file, 'tpp' '' '' on a printer. ------------------------------------------------------------- ANSWER. (The complete user.c / user.h is given in Appendix A). Some people confused what I meant by a smoothing filter -- I meant 3 x 3 averaging (DataLab function 'esm'), i.e. some used median. Essentially, what I expected you to find here is that with appropriate threshold, 'oorf' is more or less like median -- it excludes outliers. On the other hand, too low a threshold, and it becomes neighbourhood averaging. 1.1. /*---------------------------------------------------- oorf j.g.c. 05/11/95 out-of-range filter, see JC Image Processing Notes, Chapter 4.14.4 ----------------------------------------------------*/ static int oorf(IMD *g,IMD *f,Lint d, Lint rl,Lint rh,Lint cl,Lint ch, Lint wl,Lint wh,Lint vl,Lint vh) { float val,valf,valg,sum,sumsq,thr,var,fn; Lint r,c,k,l,n; n=(wh-wl+1)*(vh-vl+1); fn=n; if(n==0)return 1; for(r=rl;r<=rh;r++){ for(c=cl;c<=ch;c++){ sum=0.0;sumsq=0.0; for(k=r-wh;k<=r-wl;k++){ for(l=c-vh;l<=c-vl;l++){ IMDget(&valf,d,k,l,f); sum+=valf; sumsq+=valf*valf; } } sumsq/=fn; sum/=fn; /*averages*/ /*-- variance(x) = average of (x^2) - [average of x]^2 --*/ var=sumsq-sum*sum; var=sqrt(var); /*standard deviation*/ thr=var; IMDget(&valf,d,r,c,f); /*centre value*/ if(fabs(valf-sum)>thr) valg=sum; else valg=valf; IMDput(&valg,d,r,c,g); } } return 0; } Here, I have, as an experiment mainly, used a variable threshold based on the statistics of the data in the window, i.e. the standard deviation. Another method might base 'thr' on the standard deviation over the whole image. Test Results. A. Input Image -- Rectangle with Random Spots(impulse noise) added, i.e. using ipuser3.dl -- see Appendix B. then grect 0 4,4,10,10 grsa 0 user3 etc... Rows: 0 - 15, Cols: 0 - 15, Dims: 0 - 0. Number of classes: 1 Dim: 0 [0, 0] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 [1, 0] 0.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 [2, 0] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 1.00 0.00 0.00 0.00 0.00 [3, 0] 0.00 0.00 1.00 0.00 0.00 1.00 0.00 1.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 [4, 0] 0.00 0.00 0.00 0.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.00 0.00 1.00 0.00 0.00 [5, 0] 0.00 0.00 0.00 0.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.00 0.00 0.00 0.00 0.00 [6, 0] 0.00 0.00 0.00 0.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.00 0.00 0.00 0.00 0.00 [7, 0] 0.00 0.00 0.00 0.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.00 0.00 0.00 0.00 0.00 [8, 0] 0.00 0.00 0.00 0.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.00 0.00 0.00 0.00 0.00 [9, 0] 0.00 0.00 0.00 0.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.00 0.00 0.00 1.00 0.00 [10, 0] 0.00 0.00 0.00 0.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.00 0.00 0.00 0.00 0.00 [11, 0] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 [12, 0] 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 [13, 0] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 [14, 0] 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 [15, 0] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Output Image -- oorf: [0, 0] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.11 0.00 0.00 0.00 [1, 0] 0.00 0.11 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 [2, 0] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.11 0.00 0.22 0.00 0.00 0.00 0.00 [3, 0] 0.00 0.00 0.11 0.00 0.00 0.44 0.56 0.44 0.56 0.00 0.56 0.33 0.00 0.00 0.00 0.00 [4, 0] 0.00 0.00 0.00 0.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.00 0.00 0.11 0.00 0.00 [5, 0] 0.00 0.00 0.00 0.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.00 0.00 0.00 0.00 0.00 [6, 0] 0.00 0.00 0.00 0.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.00 0.00 0.00 0.00 0.00 [7, 0] 0.00 0.00 0.00 0.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.00 0.00 0.00 0.00 0.00 [8, 0] 0.00 0.00 0.00 0.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.00 0.00 0.00 0.00 0.00 [9, 0] 0.00 0.00 0.00 0.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 0.00 0.00 0.00 0.11 0.00 [10, 0] 0.00 0.00 0.00 0.00 0.44 1.00 1.00 1.00 1.00 1.00 0.44 0.00 0.00 0.00 0.00 0.00 [11, 0] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.44 0.00 0.00 0.00 0.00 0.00 0.00 0.11 0.00 [12, 0] 0.11 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 [13, 0] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 [14, 0] 0.00 0.00 0.00 0.00 0.11 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 [15, 0] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Output Image -- Smoothing (esm -- 3 x 3): [0, 0] 0.11 0.11 0.11 0.00 0.00 0.00 0.00 0.00 0.11 0.11 0.22 0.22 0.22 0.11 0.00 0.00 [1, 0] 0.11 0.11 0.11 0.00 0.00 0.00 0.00 0.00 0.11 0.11 0.22 0.22 0.22 0.11 0.00 0.00 [2, 0] 0.22 0.22 0.22 0.11 0.11 0.11 0.22 0.11 0.22 0.11 0.33 0.22 0.22 0.00 0.00 0.00 [3, 0] 0.11 0.11 0.11 0.22 0.33 0.44 0.56 0.44 0.56 0.44 0.56 0.33 0.33 0.11 0.11 0.11 [4, 0] 0.11 0.11 0.11 0.33 0.56 0.78 0.89 0.78 0.78 0.67 0.56 0.33 0.22 0.11 0.11 0.11 [5, 0] 0.00 0.00 0.00 0.33 0.67 1.00 1.00 1.00 1.00 1.00 0.67 0.33 0.11 0.11 0.11 0.11 [6, 0] 0.00 0.00 0.00 0.33 0.67 1.00 1.00 1.00 1.00 1.00 0.67 0.33 0.00 0.00 0.00 0.00 [7, 0] 0.00 0.00 0.00 0.33 0.67 1.00 1.00 1.00 1.00 1.00 0.67 0.33 0.00 0.00 0.00 0.00 [8, 0] 0.00 0.00 0.00 0.33 0.67 1.00 1.00 1.00 1.00 1.00 0.67 0.33 0.00 0.11 0.11 0.11 [9, 0] 0.00 0.00 0.00 0.33 0.67 1.00 1.00 1.00 1.00 1.00 0.67 0.33 0.00 0.11 0.11 0.11 [10, 0] 0.00 0.00 0.00 0.22 0.44 0.67 0.78 0.78 0.78 0.67 0.44 0.22 0.00 0.22 0.22 0.22 [11, 0] 0.11 0.11 0.00 0.11 0.22 0.33 0.44 0.44 0.44 0.33 0.22 0.11 0.00 0.11 0.11 0.11 [12, 0] 0.11 0.11 0.00 0.00 0.00 0.00 0.11 0.11 0.11 0.00 0.00 0.00 0.00 0.11 0.11 0.11 [13, 0] 0.11 0.11 0.00 0.11 0.11 0.11 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 [14, 0] 0.00 0.00 0.00 0.11 0.11 0.11 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 [15, 0] 0.00 0.00 0.00 0.11 0.11 0.11 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Comments. --------- Really, see Pratt p. 292, the out-of-range filter should remove 'outliers', i.e. points that are clearly quite a way different from their neighbours, i.e. just like median filters. In the results, you can see the known effect of 'smoothing' -- it merely smears the impulses. 'oorf' seems to do better, but, like the median filter, it has problems near edges. ------------------------------------------------------------------- 2. (a) Implement a function 'msstr' that performs the statistical contrast enhancement given by equation 4.7-1, chapter 4.7. The desired mean (m') and standard deviation (s') should be input, by the user, as arguments. (b) Carry out tests to verify its performance. Required: 2.1 Software for 'msstr' - commented as neccessary. 2.2 Data printouts for each of your tests. These data should be commented as neccessary. Marks: 2.1 10, 2.2 data 5, comments 5. Software: (sorry about name change). /*---------------------------------------------------- enms j.g.c. 06/11/95 contrast enhancement based on mean and standard deviation, see JC Image Processing Notes, Chapter 4.7 zout=(zin-meanold).stddevnew/stddevold + meannew ----------------------------------------------------*/ static int enms(IMD *g,IMD *f,Lint d, Lint rl,Lint rh,Lint cl,Lint ch, float mm, float ss) { float val,valf,valg,sum,sumsq,var; Lint r,c,n; n=(rh-rl+1)*(ch-cl+1); if(n<=0)return 1; /*-- first pass: compute mean and std. dev --*/ sum=0.0;sumsq=0.0; for(r=rl;r<=rh;r++){ for(c=cl;c<=ch;c++){ IMDget(&valf,d,r,c,f); sum+=valf; sumsq+=valf*valf; } } sumsq/=(float)n; sum/=(float)n; /*averages*/ /*-- variance(x) = average of (x^2) - [average of x]^2 --*/ var=sumsq-sum*sum; var=sqrt(var); /*standard deviation*/ printf("\nOld mean = %f, New = %f\n",sum,mm); printf("Old std. dev. = %f, New = %f\n",var,ss); /*-- second pass: apply transformation --*/ for(r=rl;r<=rh;r++){ for(c=cl;c<=ch;c++){ IMDget(&valf,d,r,c,f); if(var<0.0001)valg=valf; /*-- traps zero input std. dev. --*/ else valg=(valf-sum)*ss/var + mm; IMDput(&valg,d,r,c,g); } } return 0; } Data from Tests. ---------------- Here I simply used the example we had in class. And, I gave it 'new' values of m' = 128, and s' = 40. Note formula: variance = mean-of-squares - mean-squared = mean(pixel^2) - {mean(pixel)}^2 Input data. Class 1: 6.000000 Class Correlation Matrices --------------------------- Class 1: 44.000000 -- this is actually the mean-of-squares (of the input data). so variance = 44 - 6^2 = 44 - 36 = 8 stddev = sqrt(8) = 2.828 (i.e. sqrt(2) x 2) i.e. this confirms the result we got in the class [0, 0] 2.00 2.00 2.00 2.00 2.00 [1, 0] 4.00 4.00 4.00 4.00 4.00 [2, 0] 6.00 6.00 6.00 6.00 6.00 [3, 0] 8.00 8.00 8.00 8.00 8.00 [4, 0] 10.00 10.00 10.00 10.00 10.00 Output Data. ----------- Means. Class 1: 128.000015 -- i.e. what was required. Class Correlation Matrices --------------------------- Class 1: 17983.996094 -- this is mean-of-squares so, variance = 17984 - mean-squared = 17984 - 128^2 = 17984 - 16384 = 1600 stddev = sqrt(1600) =40 i.e. as required. Data Ranges ----------- Min: 71.43 Max: 184.57 Dim: 0 [0, 0] 71.43 71.43 71.43 71.43 71.43 [1, 0] 99.72 99.72 99.72 99.72 99.72 [2, 0] 128.00 128.00 128.00 128.00 128.00 [3, 0] 156.28 156.28 156.28 156.28 156.28 [4, 0] 184.57 184.57 184.57 184.57 184.57 Comments: See data, i.e. we have demonstrated that, for this data file, the algorithm works OK. What is the significance of the m' = 128, and s' = 40. Well, if the data are to be displayed on a grey-scale of 0..255, we would like the data to fill this range -- and symmetrically. Hence, a mean at the centre (128) and spreading out to 0 and 255 would be ideal. If the data are distributed according to a Gaussian distribution ( as might be expected) than we would expect the data to extend mean +/- 3 x stddev i.e. 128 - 120 to 128 + 120 i.e. 8 to 248 However, the data given are NOT Gaussian, so the spread is less. I used 'ggn' on a 16 x 16 image, and then this contrast stretch, and I got 25 to 230. ------------------------------------------------------------------ 3. (a) Implement the function 'rfdft' given in Chapter 3.6. Note I have not tested this, so there may be minor errors. void rfdft( float x[], /*- real input -*/ float xro[], float xio[],/*- real and imag. outputs -*/ int N) /*- length -*/ { float pi = 3.1415926,tpi; int n, u; tpi=2.0*pi; for(u=0;u