University of Ulster, Magee College. BSc. Applied Computing (1220), Year 4. Course: Image Processing (AC460). Autumn Term 1995. Lecturer: J.G. Campbell. Date: / /95 4. Image Enhancement -------------------- 4.1 Introduction. ----------------- Whenever an image passes through a 'system', i.e. gets sensed, and/or digitised, and/or compressed & decompressed, and/or transmitted, and/or displayed, the 'quality' of the image may be degraded in some way - for example, the addition of noise. Image enhancement is about the improvement of the 'quality' of the image - usually an attempt at returning it to some 'original' state. Of course, quality is subjective - it depends on what the (human or machine) user/observer intends to use the image for. Indeed, the original image may be of low quality from a few points of view, but we are content with this, and all we want to do is highlight (enhance) some features of the image - i.e. improve the quality with respect to one particular criterion. In general, the improvement of quality - no matter what the criterion for improvement - will usually be called IMAGE ENHANCEMENT; and so to some extent this chapter contains a fairly mixed bag of techniques and processes. In this chapter we will cover two broad categories of image enhancement: point (or pixel, or grey level) operations, and spatial operations (or neighbourhood operations). (1) Point operations. -------------------- We apply some function to each pixel value, individually, to yield a new value; there is a net enhancement effect on the image viewed as a whole, even though the global relationship between pixel values is ignored. In this category, the enhancement is usually for the benefit of a human observer, and the enhancement is often connected with contrast enhancement (e.g. turning up the contrast on a TV monitor - see section 2.7 for a definition of contrast). We differ from Gonzalez & Woods in calling these POINT operations - they (incorrectly, and in disagreement with most authors) call them spatial operations. (2) Spatial operations. ---------------------- Here we apply a function to a pixel and its neighbours, to yield a new value for the pixel; the function takes into account the relationship between the value of the pixel and the values of its neighbours - thus SPATIAL or NEIGHBOURHOOD operations. Broadly speaking, spatial operations come in two categories: smoothing, and sharpening. In general this chapter will concentrate on the 'enhancement' operations, e.g. smoothing out noise, enhancement of edges. We will leave it to later chapters and lectures to apply further processing to the enhanced images, e.g. application of a threshold to an edge-enhanced image, to produce edge points, tracing a path of edge points, segmentation etc. But first a section on noise. 4.2 Noise and Degradation ------------------------- Figure 4.1-1, shows an image passing through a system. At any stage in the system 'noise' may be added. Here we are using 'noise' in a very general sense - i.e. any degradation. It could be the type of noise mentioned in Chapter 1 - i.e. random numbers added to the actual image numbers; it could be thumbprint on a photograph, or grain, or streaks on a negative, or any form of sensor noise, or as on a TV electrical noise, caused by, e.g., electromagnetic noise from a starter motor that gets onto the TV transmission signal. Light Voltage Numbers Scene-----Sens------Digit-------Code---Transmit--Dec--Disp or ise ode lay ^ ^ ^ ^ ^ ^ | | | | | | Noise Noise Noise Noise Noise Noise Figure 4.2-1 Image passing through a System ------------------------------------------- In general, NOISE is data that we don't want! --------------------------------- Additive Noise: -------------- More often than not, noise is additive, i.e. the 'noisy' image is the TRUE, noiseless, image with noise added; thus (4.2-1) f(r,c) = ftrue(r,c) + n(r,c) where ftrue(.,.) is the true, noiseless image n(.,.) is a noise image, e.g. as generated by DataLab 'ggn'. Incidentally, not only is additive noise the most common, but it is also the most easy analysed and dealt with. Reduction of additive noise is covered in Section 4.10. Non-additive Noise: ------------------ Some of image RESTORATION deals with removal of non-additive noise; examples are: noise 'convolved' into an image - blur is a good example; gain calibration errors on a CCD camera is an example of multiplicative noise (N.B. bias is additive). If you think of an audio channel as conveying a sequence of data (albeit analogue), then hiss, crackle, or clicks are all undesirable data ADDED to the 'wanted' data - the voice or music - i.e. noise. The noise obscures the wanted data. The noise that causes hiss on an audio channel shares many characteristics with the noise that causes 'snow' on a TV screen; and both are very similar to the image noise shown in Chapter 1. It is not surprising, therefore, that many noise reduction processes in image processing have been borrowed from analogue (and, more recently, digital) signal processing. POINT OPERATIONS ---------------- 4.3 Grey Level Mapping by Lookup Table. --------------------------------------- The simplest form of point operation. Suppose you have an image whose data is all squashed in the range 0 to 10; it won't look like much when displayed on a 0-255 display; it will have low contrast and look grey and muddy. A simple solution is to form and apply a lookup table as follows: #define MAX 11 int lut[MAX]={0,25,50,...250}; #define NROW 64 #define NCOL 64 unsigned char f1[NROW][NCOL],f2[NROW][NCOL]; int rl,rh,cl,ch,r,l; float val;int ival,ival2; /* read in image...*/ ..... rl=0;rh=NROW-1; cl=0;ch=NCOL-1; for(r=rl;r<=rh;r++){ for(c=cl;c<=ch;c++){ ival=f1[r][c]; if(ival<=0)ival2=lut[0]; else if(ival>MAX)ival2=lut[MAX]; else ival2=lut[ival]; f2[r][c]=ival2; } } The example above is rather trivial, since multiplying by 25 would do the same. But, lookup tables are useful in a number of circumstances: (1) it may be desirable to leave the original image unaffected and send the data to display via a LUT, (2) in most hardware implemented image processing machines, there is a LUT between the memory and display; it may start off as {0,1,2,3...255} i.e. 'straight-through', but it can be modified by user or program, (3) especially in colour, it may be useful to have a selection (palette) of LUTs, cf. changing the palette on an IBM PC, (4) any 'point' operation may be implemented by LUT, (5) they are easy to specify interactively - and you see the results immediately, (6) they can be used for application of non-linear contrast adjustments, see Figure 4.3-1 output z' .. / | . / | / non-linear here | / | ____ constant here | / z'1| / | / linear here | / 0 |/__________________________ 0 z1 input z For any input pixel value z1, find the point on the horizontal (z1) axis, then move vertically to the curve, then horizontally to get z'1 = t(z1). Figure 4.3-1 Non-linear Contrast Adjustment ------------------------------------------- 4.4 Colour Lookup Tables ------------------------ (e.g. palettes in graphics). Suppose you wanted to do 'pseudo-colour' coding - just as in the radar rainfall maps in the TV weather forecast. E.g. code 0 as black, ... 5 as deep blue ... 100 as red ... 255 as bright white, etc. Assume 16 levels for red, green, and blue (typical for a cheaper system - i.e. in total a 4096 colour palette; top of range would have 256, 256, 256). The following gives the gist: (the Red, Green, and Blue values can each be e.g. 0..63, same as IBM PC with SuperVGA) Red Green Blue int lut[3][256]={ 0, 0, 0, /*0,0,0 = black*/ 0, 0, 10, /*add some blue*/ 0, 0, 20, /*and some more*/ ... 0, 10, 0, /*now green*/ 0, 20, 0, /*and some more*/ ... 0, 63, 0, /*very bright green*/ 15, 0, 0, /* red*/ ... 63, 0, 0, /*bright red */ 10, 10, 10, /*dark grey*/ 30, 30, 30, /*lighter grey*/ 63, 63, 63 /* bright white*/ } Now, apply LUT: for(r=rl;r<=rh;r++){ for(c=cl;c<=ch;c++){ ival=f1[r][c]; for{b=0;b<=2;b++}{ f2[b][r][c]=lut[b][ival]; /*f2 is the output colour image*/ } } } Again, as noted in section 4.3, LUTs are: (1) easy to do in hardware, (2) easy to specify interactively. Exercises: [see Chapters 1 and (especially) 2 for some discussion of colour]. Ex. 4.4-1. Describe, in English, what colour would appear for: (a) red=63, green=63, blue=0. (b) red=5, green=5, blue=5. (c) red=63, green=63, blue=63. (d) if red=30, green=30, blue=50 is light blue, what is red=50, green=30, blue=30 Hint: think additive (see Chapter 2). Ex. 4.4-2. Suggest, in English, a colour coding scheme for coding magnitude (examples: temperature on a map, altitude on a map, rainfall, etc.). 4.5 Grey Scale Transformation. ------------------------------ (Just a slight generalisation on chap 4.3). A lookup table is nothing more than a discrete function; again, the terms function and transformation are used interchangeably. Expressed as a function, the transformation from input image to output image is: (4.5-1) z' = t(z) where z' is output pixel value z input value i.e. z={0..zG}, z'={0..zG}, replacing G of chapter 1 with zG, t(.) transformation function. Suppose, as in chapter 4.3, the input image does not occupy the full dynamic range, e.g. an "underexposed" photograph. Suppose, a<=f1(x,y)<=b, i.e. the range of f1() is [a,b], and you want the range stretched to z0<=z'<=G, then the transformation is (4.5-2) z'= (zG-z0).(z-a)/(b-a) + z0 = (zG-z0).z/(b-a) + (z0.b-zG.a)/(b-a) scale shift Eqn. 4.5-2 is a simple linear transformation which stretches and shifts the original grey scale [a,b] to [z0,zG]. If you are familiar with graphics transformations you will recognise a one-dimensional shift and scale. Now suppose that, MOST of the input pixels are in the range [a,b], and those outside are only due to noise, or some uninteresting artefact. Then we can use (4.5-3) z'= (zG-z0).(z-a)/(b-a) + z0 , for a<=z<=b = z0 , for zb This stretches [a,b] as before, but COMPRESSES intervals [z0,a] and [b,zG] into single points, z0, and zG, respectively. In addition, there is nothing to stop you applying different linear transformations to many regions of the grey scale, see exercises. Example 4.5-1: Let a=0, b=10, z0=0, and zG=255 - we want to stretch the input range [0,10] to an output range of [0,255]. Thus eqn. 4.5-2 becomes: z' = (255-0).(z-0)/(10-0) + 0 = 25.5 * z I.e. we need no shift, and a scaling by a factor of 25.5. Ex. 4.5-2: Let a=15, b=35, z0=0, and zG=255 - we want to stretch the input range [15,35] to an output range of [0,255]. (a) Work out the transformation, (b) apply it to the input values 15, 25, 35; does it check out OK - it should be very clear to you what each of these values should map to. Ex. 4.5-3: Let a=0, b=1023, z0=0, and zG=255. (a) Work out the transformation, (b) apply it to the input values 0, 255, 511, 1023. If an input image has been subjected to a known grey scale transformation, t(.), then applying the inverse, t-1(), can remove the effect. E.g. In an X-ray image the image values are known to follow the rule: (4.5-4) f(x,y) = I0. exp(-c(x,y)) where I0 = intensity of X-ray beam c(x,y) is a function of the density and thickness of the X-rayed material. The inverse of exp(.) is ln(.) (natural log, log base e), so that if you apply a ln(.) transformation to f(x,y) you get an image which is proportional to c(x,y) - which may be more useful in some cases. Log is also generally useful where you want to stretch out the lower pixel values, while compressing the upper. Ex. 4.5-4. Plot a graphical representation of a log2 (base 2) transformation for [0..255]. Verify that the previous statement is true - stretch ... compress. 4.6 Thresholding and Slicing. ----------------------------- Suppose you have a monochrome satellite image of the earth. Water is fairly dark - all water pixels less than bw (say), and the land brighter (>bw). Then the thresholding function: (4.6-1) z' = 0 for zbw will clearly segregate land and water. The requirement to reduce a quantity to binary (true - false, yes - no, X - notX) is common in image processing, as in many computer applications; we shall come across this again - especially in image segmentation. Thresholding generalises to density slicing. Suppose in the same satellite image water values are [aw,bw], urban landuse [au,bu], grassland [ag,bg], forestry [af,bf], and none of the ranges overlap. The rule given in eqn. 4.6-2 forms the basis of a land-use mappping algorithm: (4.6-2) z'=1 for aw<=z<=bw (class 1 = water) z'=2 for au<=z<=bu (class 2 = urban) ... z'=4 for af<=z<=bf (class 4 = forestry) z'=0 otherwise. Further, if the pixels are 20 metres x 20 metres, then counting all the 4s in the output image (=nf, say) and multiplying by 0.04, will give the number of hectares of forestry in the image. [Why 0.04 ?] Also, you can apply a colour lookup table (see section 4.4) to produce a colour map of the result. 4.7 Contrast Enhancement Based on Statistics. --------------------------------------------- Sometimes we want a non-subjective method of equalising the average grey level and contrast of a number of images; the method given in this section and the next (histogram modification) meet this aim. Suppose we have an input image whose mean grey level is m and standard deviation is s. We desire m' and s'. This is simply shifting and scaling again (see chap 4.5) and the transformation given in eqn. 4.7-1 produces an output image with the desire mean and std. dev. (4.7-1) z' = (z-m) . s'/s + m' ^ ^ ^ | | | shift scale shift to mean = m' to mean = 0 Compare eqn 4.5-2 (4.5-2) z'= (zG-z0).(z-a)/(b-a) + z0 Some definitions: Mean, m: (4.7-2) m = #(all pixels) f(r,c)/MN i.e. r=0..N-1 c=0..M-1 I.e. the average grey level for whole image (In this chapter, and following ones we will use '#' to denote summation, i.e. replacing the Greek capital sigma, that is normally used. Please replace accordingly). Variance, v: (4.7-3) v = # (f(r,c) - m)**2)/MN r=0 to N-1 c=0 to M-1 I.e. the average squared deviation from the mean - a measure of the contrast present in the image; small v - low contrast. Standard Deviation, s: (4.7-4) s=sqrt(v) Ex. 4.7-1. An input image has mean m=5, standard deviation s=3. Derive the contrast stretching transformation to produce output image with m'= 128, s'=80. Ex. 4.7-2. Input image: 2 2 2 2 2 4 4 4 4 4 6 6 6 6 6 8 8 8 8 8 10 10 10 10 10 (a) What is mean, m? (b) What is standard deviation, s? (c) Derive the transformation to yield mean, m'= 128, std.dev s' = 80. (d) Then compute the output image. (e) Will it fit into [0,255]? If not, 'clip' < 0 to 0 and > 255 to 255. 4.8 Histogram Modification. -------------------------- First some definitions. Histogram: ---------- A table, with an entry for each possible pixel value, giving the number of pixels which have that value; usual notation: h(z) , z=[z0..zG]. (4.8-1) # h(z) = M.N for an M x N image. z=z0 to zG Ex. 4.8-1. Input image 2 2 2 2 2 4 4 4 4 4 6 6 6 6 6 8 8 8 8 8 10 10 10 10 10 Thus: h(0) = 0, h(1) = 0, h(2) = 5, h(4) = 5,... h(10) = 5, all others = 0. A histogram is usually shown as a graph as follows (graph on its side): 0 5 -----------------------> p 0| 1| 2|***** | |***** | |***** | |***** | 10|***** z Figure 4.8-1 Histogram ---------------------- Probability Density Function (pdf): ----------------------------------- [Note: Wheras we deal mostly with discrete numbers, e.g. the integers between 0 and 255, probability density function normally refers to the probability of REAL numbers; strictly we should use the term probability function for discrete. However, it is convenient to use pdf for either] If you divide h(z) by M.N for all z=[z0,zg], you get something which gives an estimate of the probability distribution (or probability density) of the grey levels in the image: (4.8-2) p(z) = h(z)/(M.N) (4.8-3) # p(z) = 1 (from definition of z=z0 to zG probability) The probability density p(zi) lets you know, for a stream of pixels coming from an image, the probability that the pixel value will be zi. In the example given above, p(0)=0, p(1)=0, p(2) = 1/5,...p(10)=1/5, p(11)=0... One of the motivations behind the invention of probability theory was that of gambling. If you knew that all images followed the previous density, then it would be foolish to bet on the value 12, or 0 (say), coming up. Ex. 4.8-2. You will find that the noisy image shown in Chapter 1 has a histogram something like: 0 -----------------------> p 0| | .... |* |***** 60 |*********** |***** |** |* | .... |* |***** 160|*********** |***** |** |* | z Figure 4.8-2 Histogram of Noisy Image ------------------------------------- Ex. 4.8-3. What is the histogram of the clean cross image in Chapter 1? Ex. 4.8-4. A fair dice is thrown 2400 times. What, approximately, will be the histogram? Ans: h(1) = 2400/6 = 400 h(2) = 400, ... h(6) = 400. Ex. 4.8-5. What is the probability density for any single number (not a 6 number 'play') on a lottery with numbers labelled 1 to 39? Assume fair! Ex. 4.8-6. (a) What is the probability of the numbers (1,5,11,15, 22, 33) coming up in a six-number lottery -- numbers labelled 1 to 39? (b) Hence, at what stage does the lottery become a good bet? i.e. better than evens? ANS: when the pool becomes better than about 1,000,000 pounds. Why? [Note: this was originally written before the so-called bonus number appeared!]. Cumulative Distribution Function: --------------------------------- The cumulative distribution function (cdf), F(z), gives the accumulated probability from minus infinity (or z0 in our case) up to z, i.e. (4.8-4) F(z') = # p(z) z= z0 to z' Thus, F(zG) = 1. Ex. 4.8-7. (a) Prove F(zG) = 1. (b) Prove F(z0) = p(z0). Histogram Transformation. ------------------------ Suppose you had an image with a histogram as shown in Figure 4.8-3, i.e. with a severely compressed dynamic range. Shown on a 255 grey level display this would look 'flat' - all dull grey, 'muddy' - with no detail evident. You could use one of the contrast enhancement methods mentioned above OR you could set up a lookup table based on the requirement to modify the histogram to a more suitable shape, q(z), instead of p(z). In what follows, we will consider arbitrary q(z), but often we will want HISTOGRAM EQUALISATION, i.e. a flat histogram, or UNIFORM probability distribution function. Histogram equalisation usually gives a satisfactory contrast enhancement. (4.8-5) q(z) = 1/G for z=z0...zG where G = number of possible grey levels. 0 -----------------------> p 0|**** |** |* |***** 4 |*********** |***** |** z |* 8 |** ... | | | 255| Figure 4.8-3 Histogram - compressed dynamic range ------------------------------------------------- Although there are no values above 8 in Figure 4.8-3, a similar state of affairs can exist if you have a few values all over the range 0 to 255, but with most of them clustered at one small sub-range. Although a linear contrast stretch will not work in this latter case (see eqn 4.5-2) (eqn. 4.5-3 addresses this problem), histogram equalisation will still work. The treatment here differs from that in [G&W] (which is more mathematical); I have taken it from [Lim, chap 8.1]. See also [Rosenfeld & Kak chap 6.2.4]. You will also find contrary treatments in Boyle and Thomas, and Low; these work ONLY for histogram equalisation, wheras the method given below will work for ANY desired histogram, including UNIFORM (equalisation). Notation: Input grey levels z, Output grey levels w, Transformation (lut) w = T(z), Input probability distribution p(z), cdf P(z) Desired (output) distribution q(w), cdf Q(w) Np = number of pixels in image. Algorithm: ---------- (1) Estimate frequency of grey levels (histogram), ph(z). (2) From ph(z) compute p(z) = ph(z)/Np. (3) Form cdfs P(z), and Q(w), from p() and q(), respectively, z' e.g. P(z') = # p(z) z=z0 (4) for each z in the input do: choose w such that Q(w) is closest to P(z), set T(z) = w; (5) Apply T(z) to input image f1[r,c]: f2[r,c] = T(f1[r,c]) Example. 4.8-8. Suppose an 8-level image, (G=8); M=8, N=16 (M.N = no. of pixels = 128). Input histogram ph(), output q() - see Figure 4.8-4. z 0 1 2 3 4 5 6 7 ----------------------------------------- ph(z) 1 7 21 35 35 21 7 1 q(w) 16 16 16 16 16 16 16 16 Figure 4.8-4 Input and Output Histograms ---------------------------------------- (1) Form P(z), and Q(w), from p() and q() , see Figure 4.8-5: z 0 1 2 3 4 5 6 7 ----------------------------------------- P(z) 1 8 29 64 99 120 127 128 Q(w) 16 32 48 64 80 96 112 128 Figure 4.8-5 Input and Output cdfs ----------------------------------- (2) for each z in the input do: (2.1) choose w such that Q(w) is closest to P(z), T(z) = w; (i.e. set up lookup table/function) z=0 : must choose T(0) = 0 (closest) z=1 : T(1) = 0 z=2 : T(2) = 1 z=3 : T(3) = 3 z=4 : T(4) = 5 z=5 : T(5) = 6 z=6 : T(6) = 7 z=7 : T(7) = 7 Thus, resulting histogram: z 0 1 2 3 4 5 6 7 ----------------------------------------- p(z ) 1 7 21 35 35 21 7 1 q'(w) 8 21 0 35 0 35 21 8 Figure 4.8-4 Input and Output Histograms ---------------------------------------- Notice that q'(w) is only an approximation to q(w), in general, for discrete levels, this will be the case. (We could make the equalisation exact by modifying the algorithm to allocate the pixels at input grey level zi, not just to one output level wj, but put some in wj-1, and some in wj+1 - to produce an exact q(w) histogram. Rosenfeld and Kak, chap 6.2, do this. To me it doesn't make much sense to map one pixel value 3 (say) to 1, another to 2, and maybe another to 4. Furthermore, the lookup table algorithm would have to be made considerably more complex - for very little gain). Ex. 4.8-9. Suppose that a 64 x 64, 8-level image has the grey-level distribution shown in Figure 4.8-5. I have also included in the table p(z), the density calculated from the histogram. Calculate the transformation function, T(z), to give histogram equalisation, using the algorithm given above. Then calculate q'(w), the actual output histogram, and plot it. Recommendation: Proceed as follows. Work out q(w). The work out the cumulatives P(), and Q(); then apply the algorithm. Note: this example is taken from G&W *2nd ed.*, DON'T copy their answer, it's wrong! ------------------------------------------------------------ z ph(z) p(z)= q(w) P(z) Q(z) T(z) ph(z)/M.N ----------------------------------------------------------- 0 790 0.19 1 1023 0.25 2 850 0.21 3 656 0.16 4 329 0.08 5 245 0.06 6 122 0.03 7 81 0.02 ----------------------------------------------------------- Figure 4.8-5 Histogram for ex. 4.8-9 ------------------------------------ Ex. 4.8-10. Using the same input histogram as in Ex. 4.8-9, produce a transformation to yield the density shown, q(w) in Figure 4.8-6. Again, be wary of the G&W answer for this. ------------------------------------------------------------ z ph(z) p(z)= q(w) P(z) Q(z) T(z) ph(z)/M.N ----------------------------------------------------------- 0 790 0.19 0.00 1 1023 0.25 0.00 2 850 0.21 0.00 3 656 0.16 0.15 4 329 0.08 0.20 5 245 0.06 0.30 6 122 0.03 0.20 7 81 0.02 0.15 ----------------------------------------------------------- Figure 4.8-6 Densities for ex. 4.8-10 ------------------------------------ [Low], chap 5, [G&W] chap 4.2 give good examples of effectiveness of histogram modification. Ex. 4.8-11. Someone has asserted: "histogram modification (or any form of contrast stretching) is unlikely to be of assistance in a purely AUTOMATIC machine vision application (i.e. one in which there is no human intervention), since these techniques introduce no new information - in fact, they often destroy information" On the basis of Chapters 4.4 to 4.8, comment on this assertion. Hint: Look carefully at the histogram equalisation transformation function in Example 4.8-8. Compare the input and output histograms. How many grey levels in the input? How many in the output? 4.9 Local Enhancement. ---------------------- The point (grey level) methods in chaps 4.5 to 4.8 are applied GLOBALLY, i.e. to the whole image. But suppose that the contrast differs across the image, e.g. due to shadow, or parts of a TV camera sensing surface having lost sensitivity. Or, as in the example in [G&W] p.159, you have a global dynamic range that is greater than the display can handle, i.e. you would like to, adaptively, focus on a diiferent part of the grey scale, at different parts of the image. Thus, you need to allow the transformation to vary across the image. This can be done by defining a m x n neighbourhood. The neighbourhood is 'scanned' across the image pixel by pixel. It stops at each pixel, computes the transformation for the current m x n pixels, and applies that transformation to the centre pixel (only). Obviously, this is a lot of computing, which can be reduced by not overlapping the neighbourhoods; however, a checkerboard effect may result. See [G&W] for some results of local enhancement. 4.10 Noise Reduction by Averaging of Multiple Images. ---------------------------------------------------- Assume you have the possibility of obtaining many still, 'identical', images of a scene; but, the images are picking up noise in transmission, or from the sensor system. Then averaging together these images pixel by pixel: (4.10-1) fa(r,c) = (1/Na)sum(i=1..Na)fi(r,c) for r=rl..rh, c=cl..ch. where Na = number of images averaged, and fi(.,.) is the ith image. N.B. this is done for each pixel; we are NOT smoothing across pixels. Thus, we can talk about the mean, m(r,c), and variance v(r,c) of each pixel. If our model is that the only distortion is noise, then fi(r,c) can be written: (4.10-2) fi(r,c) = f(r,c) + ni(r,c) i.e. the ith image is the TRUE, noiseless image, plus the ith noise image. Most naturally occurring noise has the characteristic that it is random and uncorrelated. Often too, it is zero mean. Roughly speaking, these last two statements indicate that for every positive noise value, you will eventually get a negative one, and if you take enough values in the average you end up with zero. Example 4.10-1. Throw a dice many times; for each throw you get one of 1,2,..6; from each throw subtract 3.5 and add the number to a running sum (which was initialised with zero); as the number of throws gets larger, the sum approaches 0.0 more closely. It can be shown that if the noise level (e.g. as indicated by the standard deviation of the pixel values (at (r,c) ) ) is s1 for 1 image (no averaging), then it is (4.10-3) sn = s1 / sqrt (Na) for Na images averaged. You can easily experience two examples of this: (1) on a noisy stereo radio reception, switch the tuner to mono; this causes the system to add the left and right signals to produce one signal; the noise std. dev. reduces by 1/sqrt(2) = 1/1.414, i.e. reduces to 0.707 of what it was, (2) freeze frame on a videoplayer, see how noisy the image is compared to moving: the eye tends to average over a number of the 25 images painted on the screen per second. Ex. 4.10-2. On DataLab, generate a rectangle, (grect) add Gaussian noise to it (ggna) - observe the noiselike mottled effect; now add more noise (ggna) say 10 times; notice how the mottled effect decreases; the noise is cancelling itself out. Ex. 4.10-3. You are working with a noisy TV camera and a digitiser. You observe that the noise standard deviation, after the digitiser, is about 16 grey levels. You have detail in the image that requires better than 5 grey levels precision, to be sure of resolving it; thus, in single images, the detail is 'buried' in the noise. If possible, how many images should you average. Incidentally, image averaging does not directly smooth the image, nor sharpen, or otherwise spatially affect it. [Note: we may have time, towards the end of the course, to do some statistics, and so provide a more rigorous explanation of this stuff] SPATIAL OPERATIONS - SMOOTHING ------------------------------ 4.11 Neighbourhood Averaging ---------------------------- Suppose that small neighbourhoods (e.g. 3 x 3) of an image are known to exhibit very smooth variations in pixel values; so that, any abrupt variations have to be due to noise; of course, this is not always true. For this sort of image, if you average over the neighbourhood (window), you will average the noise, which (in the average) tends to zero. The window (or mask) can be any shape, but is usually square, and often 3 x 3. Eqn. 4.11-1 formally describes the process. (4.11-1) g(r,c) =(1/Nw) # f(i,j) (over window centred on (r,c) ) where g = output image f = input image. For a window which stretches -w to +w on either side of the central pixel, in the vertical dimension (rows) and -v to +v in the horizontal (columns) this becomes (see Figure 4.11-1): r+w c+v (4.11-2) g(r,c) = # # f(k,l).(1/Nw) k=r-w l=c-v where Nw = (2.w+1).(2.v+1) -v 0 +v -w +-----------------------------+ | 1/9 | 1/9 | 1/9 | |f(r-1,c-1|f(r-1,c) |f(r-1,c+1| | | | | +-----------------------------+ | 1/9 | 1/9 | 1/9 | 0 |f(r,c-1) |f(r,c) |f(r,c+1) | | | | | +-----------------------------+ | 1/9 | 1/9 | 1/9 | |f(r+1,c-1|f(r+1,c) |f(r+1,c+1| | | | | +w +-----------------------------+ Figure 4.11-1 Window (w=1, v=1, 3x3) centred on pixel (r,c) ---------------------------------------------------------- Eqn. 4.11-2 allows any window whose dimensions are odd numbers (2.w+1, 2.v+1) - so that the window extends symmetrically on each side of the central pixel. The factor 1/Nw (= 1/9 for 3x3), is the so-called WEIGHT of the WINDOW or MASK. For averaging, all weights are equal. The formal style of eqn. 4.11-2 will allow us extend the notion of windows to perform any function, (smooth, sharpen, filter); generally this is called CONVOLUTION. Equivalent terms are: window, mask, template, convolution kernel, kernel, weighting function; also, see later, impulse response. The major problem with averaging is that real spikes and discontinuities are smoothed as well, not just the noise induced ones. Thus edges and points are blurred. If you start off with a resolution of 10m x 10m (say), and use a 3x3 window, you effectively reduce the resolution to 30m x 30m. 4.12 Lowpass Filtering ---------------------- If you use a window like one of those in Figures 4.12-1(a) or (b), the process is given the general term LOWPASS FILTERING; it is called this because you are filtering out high frequency components from the image (for more about frequency, filtering etc. see chapter 3). Sometimes these windows produce better results than straight averaging. ------------------------------------------------------------ 1/10 1/10 1/10 1/16 2/16 1/16 1/9 1/9 1/9 1/10 2/10 1/10 2/16 4/16 2/16 1/9 1/9 1/9 1/10 1/10 1/10 1/16 2/16 1/16 1/9 1/9 1/9 (a) (b) (c) 1/5 1/6 1/5 1/5 1/5 1/6 1/3 1/6 1/5 1/6 (d) (e) Figure 4.12-1 Lowpass Filtering Windows. --------------------------------------- In Figure 4.12-1 (c) is averaging, (d) and (e) are so-called 'plus shaped' windows, i.e. zero at corners. Convolution. ------------ In the case of filtering, the more general form of eqn 4.11-2 must be given, this is CONVOLUTION: r+w c+v (4.12-1) g(r,c) = # # f(k,l).h(r-k,c-l) k=r-w l=c-v where h(.) are the weights. In shorthand this can be written: (4.12-1 (a) ) g = f (*) h , where '(*)' denotes convolution. The arrangement for convolution is shown in Figure 4.12-2 -v 0 +v -w +-----------------------------+ | h(1,1) | h(1,0) | h(1,-1) | |f(r-1,c-1|f(r-1,c) |f(r-1,c+1| | | | | +-----------------------------+ | h(0,1) | h(0,0) | h(0,-1) | 0 |f(r,c-1) |f(r,c) |f(r,c+1) | | | | | +-----------------------------+ | h(-1,1) | h(-1,0) | h(-1,-1)| |f(r+1,c-1|f(r+1,c) |f(r+1,c+1| | | | | +w +-----------------------------+ Figure 4.12-2 Convolution Window Centred on Pixel (r,c) ------------------------------------------------------- -v 0 +v -w +-----------------------------+ | h(-1,-1)| h(-1,0) | h(-1,+1)| | | | | +-----------------------------+ 0 | h(0,-1) | h(0,0) | h(0,+1) | | | | | +-----------------------------+ | h(+1,1) | h(+1,0) | h(+1,+1)| | | | | +w +-----------------------------+ Figure 4.12-3 Convolution Window - itself. ----------------------------------------- The observant will notice that, in eqn. 4.12-1, see Figures 4.12-2 and 4.12-3, that the window is placed as a mirror image of itself - rotated by 180 degrees; of course, usually it doesn't really matter, if the window is symmetric. Again, we adopt this formal style because it helps unify all mask/window based operations into one single concept: CONVOLUTION. Much more about convolution later. The function h(i,j) in eqn. 4.12-1 is called the IMPULSE RESPONSE of the filter. That is, if you have an image with a single pixel, of value 1 (an impulse), at (r0,c0), and zeros elsewhere, then applying eqn. 4.12-1 will produce an output image with the values of h(.,.) centred around (r0,c0) - and zero everywhere else. Ex. 4.12-1. Prove the previous assertion by applying each of the filters in Figure 4.12-1, using eqn 4.12-1, to the 'impulse' image given in Figure 4.12-2. You should set the values around the edge of the output image to 0. 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 Figure 4.12-2 Image (5x5) with impulse at (r=2,c=2) --------------------------------------------------- Ex. 4.12-2. Given an input image f[rl..rh][cl..ch], an output image g[rl..rh][cl..ch], a filter window (impulse response) h[-w..+w][-v..+v], write a computer program fragment to perform eqn. 4.12-1. Hint 1: there will be FOUR nested loops. Hint 2: Start off with: for r:= rl+w to rh-w do for c:=... sum:=0.0 Test the fragment by dry running it to do Ex. 4.12-1. Ex. 4.12-3. You want to save memory. You decide to use only one image , f(), and store the results of filtering back in f(), i.e. f() is input (source) and output (destination). (a) Explain why this will not work. (b) For what sort of operations will this work OK.? (c) try out on DataLab (e.g. functions esobx, esm). Ex. 4.12-4. Adapt the algorithm of Ex. 4.12-2 to work in a programming language that insists that all array indices start at 0 (or 1). 4.13 Median Filtering --------------------- As mentioned in chapter 4.11, a problem with averaging is its tendency to blur edges. MEDIAN FILTERING is a method which largely avoids this problem. In addition median filtering is better suited to removing 'impulse' noise, i.e. noise characterised by large irregular spikes, or so-called 'salt-and-pepper' noise. In an audio signal this is the type of noise that comes from a scratch on a record, rather than the more normal 'hiss'. Median filtering works by taking all the values in a neighbourhood, sorting them according to magnitude, and taking the middle ranked value (median) as the output value. Example. 4.13-1. A one dimensional example. Uses a 3 wide window. Input values: 0 0 6 0 0 0 12 0 0 0 15 15 15 15 33 15 15 15 0 0 0 0 0 If you apply a 3 wide averaging window to this you get 0 2 2 2 0 4 4 4 0 5 10 15 15 21 21 21 15 10 5 0 0 0 0 (1) The noise 'spikes' (6,12,33) are smoothed - rather smeared - but not removed. (2) Step edges (0 0 0 15 15 15 ) are blurred (to 0 5 10 15 15). Application of a 3 sample wide median filter produces: (original first) 0 0 6 0 0 0 12 0 0 0 15 15 15 15 33 15 15 15 0 0 0 0 0 (median filtered) 0 0 0 0 0 0 0 0 0 0 15 15 15 15 15 15 15 15 0 0 0 0 0 Thus the median filter completely removes the spikes, but the edges are preserved. See the texts, especially [Low] p. 76, [G&W] for good examples. As with the weighted filtering windows, the median filter window can be any shape or size; but is often square and 3 x 3. Median filtering is a non-linear filter - we have covered the distinction linear vs. non-linear in chapter 3. Ex. 4.13-1. (a) Given an input image f[rl..rh][cl..ch], write a computer program fragment to perform a 3 x 3 median filter - output image g[rl..rh][cl..ch]. (See ex. 4.12-2, you should probably adapt the results of that). As before there will be, at least FOUR nested loops, starting off with: for r:= rl+w to rh-w do for c:=... If you wish, you may assume you have a function sort(x[] : integer, var, n : integer) which sorts an array of integers: x[0] = largest, x[n-1] smallest. n = size of array. (c) Test the fragment by dry running it on Ex. 4.13-2. (d) How could you speed up the algorithm? Hint: not all of the array (to be sorted) changes as the window is swept over the image. Ex. 4.13-2 Perform 3x3 median filtering on the image given in Figure 4.13-1. Draw a picture of the result. What happens at corners? 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 9 0 0 6 0 0 0 0 9 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 9 1 1 1 9 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 9 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 9 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 9 9 0 1 1 1 1 0 0 0 9 0 0 0 0 0 0 0 0 1 1 9 1 0 0 0 0 0 0 0 0 0 9 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Figure 4.13-1 Letter 'T' with impulse noise. -------------------------------------------- Ex. 4.13-3. Perform 3x3 averaging on Figure 4.13-1. Compare result with that obtained by median filtering. Which would be best if you wanted to recognise the character? Ex. 4.13-4. A so-called 'separable median filter' is performed by sweeping a 1x3 median filter along each of the rows, followed by a 3x1 median filter down each of the columns. (a) Apply this to Figure 4.13-1. (b) Compare the result with that for Ex. 4.13-2. Ex. 4.13-5. The separable median filter should be much faster; make a brief comparison of the likely speed performance of separable, and square median filters. Another example: Figures 4.13-2 (a-1) to (d-2) show the following sequence of images: (a) 16 x 16 image with rectangle (b) Rectangle with random spots added. (c) Result of 3x3 smoothing (d) Result of 3x3 median filter. In each case the corresponding data are printed. 0123456789012345 ---------------- 0| | 1| | 2| | 3| | 4| MMMMMMMMM | 5| MMMMMMMMM | 6| MMMMMMMMM | 7| MMMMMMMMM | 8| MMMMMMMMM | 9| MMMMMMMMM | 0| MMMMMMMMM | 1| MMMMMMMMM | 2| MMMMMMMMM | 3| | 4| | 5| | +----------------+ +0123456789012345 DataLab I-00:00 24/01/92 14:16 Figure 4.13-2 (a-1) 16 x 16 image with rectangle ------------------------------------------------ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 99 99 99 99 99 99 99 99 99 0 0 0 0 0 0 0 99 99 99 99 99 99 99 99 99 0 0 0 0 0 0 0 99 99 99 99 99 99 99 99 99 0 0 0 0 0 0 0 99 99 99 99 99 99 99 99 99 0 0 0 0 0 0 0 99 99 99 99 99 99 99 99 99 0 0 0 0 0 0 0 99 99 99 99 99 99 99 99 99 0 0 0 0 0 0 0 99 99 99 99 99 99 99 99 99 0 0 0 0 0 0 0 99 99 99 99 99 99 99 99 99 0 0 0 0 0 0 0 99 99 99 99 99 99 99 99 99 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Figure 4.13-2 (a-2) Data for (a-1) ---------------------------------- 0123456789012345 ---------------- 0| / | 1| / | 2| / / | 3| / / / / | 4| ////////// | 5| //M////// | 6| ///////// | 7| /////M/// | 8| ///////// | 9| ///////// / | 0| ///////// | 1| ///M///// / | 2|/ ///////// | 3| | 4| / | 5| | +----------------+ +0123456789012345 DataLab I-02:00 24/01/92 14:16 Figure 4.13-2 (b-1) Recatngle image with random spots added. ------------------------------------------------------------ 0 0 0 0 0 0 0 0 0 0 0 0 99 0 0 0 0 99 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 99 0 99 0 0 0 0 0 0 99 0 0 99 0 99 0 0 0 99 0 0 0 0 0 0 0 0 99 99 99 99 99 99 99 99 99 99 0 0 0 0 0 0 99 99 198 99 99 99 99 99 99 0 0 0 0 0 0 0 99 99 99 99 99 99 99 99 99 0 0 0 0 0 0 0 99 99 99 99 99 198 99 99 99 0 0 0 0 0 0 0 99 99 99 99 99 99 99 99 99 0 0 0 0 0 0 0 99 99 99 99 99 99 99 99 99 0 99 0 0 0 0 0 99 99 99 99 99 99 99 99 99 0 0 0 0 0 0 0 99 99 99 198 99 99 99 99 99 0 99 0 99 0 0 0 99 99 99 99 99 99 99 99 99 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 99 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Figure 4.13-2 (b-2) Data for (b-1) ---------------------------------- 0123456789012345 ---------------- 0| ... | 1| ... | 2|... . . ,.. | 3| .,-/-/-=//. | 4| ,/XBX+=++=, | 5| ,=MMMBBBB+- | 6| ,=MMMMMMB=, | 7| ,=BBBMMMB=, | 8| ,=BBBMMMB=- | 9| ,=BBBBBBB=- | 0| ,=BMMMBBB=/..| 1| ,=BMMMBBB=- | 2| .-=+++===-, | 3| .,-,,,,,,. | 4| | 5| | +----------------+ +0123456789012345 DataLab I-03:00 24/01/92 14:16 Figure 4.13-2 (c-1) Smoothed (b-1) 3x3 window --------------------------------------------- 10 10 10 0 0 0 0 0 10 10 21 21 21 10 0 0 10 10 10 0 0 0 0 0 10 10 21 21 21 10 0 0 21 21 21 10 10 10 21 10 21 10 32 21 21 0 0 0 10 10 10 21 32 43 54 43 54 43 65 54 54 21 10 10 10 10 10 32 54 87 98 87 76 65 76 76 65 32 10 10 0 0 0 32 65 109 109 109 98 98 98 98 76 43 10 10 0 0 0 32 65 109 109 109 109 109 109 98 65 32 0 0 0 0 0 32 65 98 98 98 109 109 109 98 65 32 0 0 0 0 0 32 65 98 98 98 109 109 109 98 65 43 10 10 0 0 0 32 65 98 98 98 98 98 98 98 65 43 10 10 0 0 0 32 65 98 109 109 109 98 98 98 65 54 21 21 10 10 0 32 65 98 109 109 109 98 98 98 65 43 10 10 10 10 0 21 43 65 76 76 76 65 65 65 43 32 10 10 10 10 0 21 32 43 32 32 32 32 32 32 21 10 0 0 0 0 0 10 10 10 0 0 0 0 0 0 0 0 0 0 0 0 0 10 10 10 0 0 0 0 0 0 0 0 0 0 Note smudging of edges. Figure 4.13-2(c-2) Data for (c-1) --------------------------------- 0123456789012345 ---------------- 0| | 1| | 2| | 3| M M MMM | 4| MMMMMMMMM | 5| MMMMMMMMM | 6| MMMMMMMMM | 7| MMMMMMMMM | 8| MMMMMMMMM | 9| MMMMMMMMM | 0| MMMMMMMMMM | 1| MMMMMMMMM | 2| MMMMMMM | 3| | 4| | 5| | +----------------+ +0123456789012345 DataLab I-04:00 24/01/92 14:16 Figure 4.13-2(d-1) Figure (b-1) Median filtered. ------------------------------------------------ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 99 0 99 0 99 99 99 0 0 0 0 0 0 0 99 99 99 99 99 99 99 99 99 0 0 0 0 0 0 0 99 99 99 99 99 99 99 99 99 0 0 0 0 0 0 0 99 99 99 99 99 99 99 99 99 0 0 0 0 0 0 0 99 99 99 99 99 99 99 99 99 0 0 0 0 0 0 0 99 99 99 99 99 99 99 99 99 0 0 0 0 0 0 0 99 99 99 99 99 99 99 99 99 0 0 0 0 0 0 0 99 99 99 99 99 99 99 99 99 99 0 0 0 0 0 0 99 99 99 99 99 99 99 99 99 0 0 0 0 0 0 0 0 99 99 99 99 99 99 99 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Figure 4.13-2(d-2) Data for (d-1) --------------------------------- 4.14 Other Non-linear Smoothing ------------------------------- 4.14.1 Mode. ------------ The output value is the most common in the neighbourhood. Gives similar results to median in many cases. Very useful if the image is made up of labels (e.g. of landuse), i.e. after classification or segmentation. In that case you are taking a vote as to the most common label in the neighbourhood; certainly in that case averaging or median would make no sense. Why? 4.14.2 k-Nearest Neighbour (kNN) -------------------------------- Set g(r,c) to the average of the k pixels, in n(r,c) (the neighbourhood, centred on (r,c)) whose values are closest to f(r,c). Typically use k=6 for a 3x3 neighbourhood. Preserves edges. 4.14.3 Sigma Filter ------------------- Set g(r,c) to the average of all input pixels, in n(r,c), whose values are within T of f(r,c). T is a parameter. Called sigma (greek letter, the symbol for standard deviation) because T may be based on the std. dev. Similar performance to kNN. 4.14.4 Out-of-range filtering ----------------------------- Compute average in the neighbourhood (n(r,c)) of (r,c); average =fa(r,c). If the difference between f(r,c) and fa(r,c) is greater than some threshold, set the output g(r,c) to the average, otherwise set it to f(r,c) (i.e. leave it the same). I.e. If |f(r,c) - fa(r,c)| > Thresh then g(r,c) = fa(r,c) else g(r,c) = f(r,c) Ex. 4.14.1. Repeat Ex. 4.13-1 for the out-of-range filter. Ex. 4.14.2 Repeat Ex. 4.13-2 for the out-of-range filter. 4.14.5 Closest-of-minimum-and-maximum. -------------------------------------- Compute the min. and max. values in n(r,c). Set g(r,c) to the one that is closest to f(r,c). Useful for sharpening boundaries. Usually iterated. Will leave isolated spikes - these can be removed by median filter. 4.14.6 Min-Max. --------------- Compute minimum (or maximum) value in n(r,c) and set g(r,c) to that. Useful for shrinking or expanding 1s in a binary picture. More on this when we talk about segmentation. 4.15 Image Sharpening - General ------------------------------- Often called 'edge detection' - but, strictly, we need thresholding to follow the sharpening to qualify for this name. Figure 4.15-1 shows a full edge detection system, based on gradient. f(x,y)---->Grad---->|.|---->Thresh---->Edge----->Edge ient old Thinning |.| denotes absolute value Figure 4.15-1 Edge Detection System. ------------------------------------ Most edge sharpening processes will also sharpen spikes. In contrast with smoothing operations, which are usually lowpass filters, sharpening operations are associated with highpass filtering. Noise is usually high frequency, so sharpening will often increase noise effects. The generalised windowing / masking introduced in chapter 4.12 is easily applied to sharpening - you just change the weights. SPATIAL OPERATIONS - SHARPENING ------------------------------- 4.16 Gradient Based Edge Enhancement. ------------------------------------- 4.16.1 Introduction ------------------- The first part of this section introduces differentiation (of continuous functions), indicates how gradient and 'edginess' are related, then shows how to construct windows that do the discrete (digital) version of differentiation. The use of gradient is based on the reasoning that edges are characterised by large slopes in the image function f(x,y) (view f(x,y) as forming the height coordinate on the x-y plane). 4.16.2 Gradient, Slope and Differentiation ------------------------------------------ Figure 4.16-1 illustrates gradient of a one-dimensional function f(.) of variable x, f(x). NOTE: In what follows, I use 'D' instead of Greek capital delta (triangle - which is used for difference), and 'del' for Greek lower case delta (the curly one - which is used for partial differential). f(x) | / | / (3) | / | ____ (2) f(x2)12| / | /| f(x1) 6| /__| Df (1) | / Dx 0 |/__________________________ x 0 x1 x2 =2 =4 Figure 4.16-1 Illustration of 1-D gradient. ------------------------------------------- The gradient (or slope) at a point (x) is determined by dividing the increase in f, Df = f(x2) - f(x1), (in the example) by the increase in the argument, x, Dx = x2 - x1 , (in the example) (4.16-1) gradient(f(x)) = Df/Dx at halfway between x1, x2 In the limit, as x2-x1 approaches 0, (x2 -> x1 ), you end up with: (4.16-2) df/dx = lim (Df/Dx) Dx->0 df/dx is DIFFERENTITION. In Figure 4.16-1, df/dx is about 1.0 at (1), a bit less at (3) and 0.0 at (2); thus df/dx measures slope or gradient. 4.16.3 Discrete Differentiation - Differences --------------------------------------------- If we now go discrete, as in Figure 4.16-2, we change from continuous variable x to discrete variable i; f(x) becomes a sequence of numbers, f(i): ------------------------------------------------- i 0 1 2 3 4 5 6 7... 12 f(i) 0 3 6 9 12 15 18 18 27 ------------------------------------------------- Figure 4.16-2 Discrete Function ------------------------------- f(i) is shown graphically in Figure 4.16-3. f(i) 27| * | * | * | * * * * f(4) 15| * f(3) 12| * 9| * 3| * 0|*__________________________ i 0 1 2 3 4 5 6 7 8 9 ...12 Figure 4.16-3 Graph of Discrete Function. ----------------------------------------- The discrete version of differentiation is DIFFERENCING. If you have a sampled version of an analogue function, then the difference gives the best available approximation of differential. At any point i, the difference of f(i) is (4.16-3(a)) Df(i) = f(i) - f(i-1) If the sampling period (Dx) is 1 then eqn 4.16-3 is a direct replacement for the differential. (4.16-3(a)) gives an output value centred on (i+1/2); to make it symmetric you can use: (4.16-3(b)) Df(i) = f(i+1) - f(i-1) Ex. 4.16-1. In Figure 4.16-1 assume that, in region (1), f(x) = 3x. (a) What is df/dx in this region? Figure 4.16-3 shows a sampled (discrete) version of f(x); sampling period Dx = 1. (b) What is Df(i) at i=3? (eqn 4.16-3). (c) Compare the results of (a), and (b). (d) What are TWO circumstances in which the results will differ? 4.16.4 Second Differences ------------------------- If you differentiate df/dx, you get d2f/dx2. In terms of differences, assume you have the difference sequence: Df(0) Df(1) ...Df(i-1) Df(i) ...Df(n-1) Taking differences on this (equivalent to d2f/dx2 in continuous functions) yields (just apply eqn 4.16-4 again): (4.16-4) D2f(i) = Df(i) - Df(i-1) From eqn. 4.16-4, Df(i) = f(i) - f(i-1) similarly, Df(i-1) = f(i-1) - f(i-2) Substituting gives: (4.16-5) D2f(i) = f(i-2) - 2.f(i-1) + f(i) This is offset by 1 sample, so a better (symmetric) approximation of d2f/dx2 is: (4.16-6) D2f(i) = f(i-1) - 2.f(i) + f(i+1) Usually, the signs are negated. 4.16.5 Differentiation in 2-D. - Partial Differentials ------------------------------------------------------ Moving to two dimensions introduces a very minor addition. Here we now have TWO gradients: delf(x,y)/delx in the x direction, and delf(x,y)/dely in the y direction, the so-called partial differentials. delf(x,y)/delx is f(x,y) differentiated with respect to x, with y held constant, i.e. (4.16-7) fx(x,y) = delf(x,y)/delx = df(x,y = y1)/dx at x, y=y1 (called fx(x,y) - shorthand) Similarly delf(x,y)/dely (= fy(x,y)). A similar del2f(x,y)/delx2 exists, fxx(x,y). If you want the magnitude of the complete gradient, you use Pythagoras' theorem to add them vectorially: (4.16-8) G(f(x,y)) = sqrt(fx()**2 + fy()**2) Sometimes the quicker addition: (4.16-9) G(f(x,y)) = |fx()| + |fy()| is used. If you convert eqn. 4.16-9 to discrete, you get: (4.16-10) G(f(r,c)) = |fr(r,c)| + |fc(r,c)| here fr(), are now differences (see (4.16-3) ): (4.16-11) (a) fr(r,c) = f(r,c) - f(r-1,c), and (b) fc(r,c) = f(r,c) - f(r,c-1) These can be computed by simple differencing along the rows, then the columns, and adding the absolute values of the results. 4.16.6 Windows for Differentiation. ----------------------------------- In terms of windows, (4.16-11) suggests, in the notation of (4.12-1(a)): (4.16-12(a)) fr(r,c) = f (*) hr (b) fc(r,c) = f (*) hc where hr is a 2x1 window, (4.16-13(a)) hr = -1 +1 and hc is a 1x2 window: (4.16-13(b)) hc = -1 +1 If you base the edge enhancement of the symmetric difference of (4.16-3(b)) you have hr (3x1) and hc (1x3): (4.16-14(a)) hr= -1 0 +1 (4.16-14(b)) hc= -1 0 +1 4.16.7 Other Gradient Windows ----------------------------- Prewitt Operators: ------------------ (4.16.7-1) hr = -1 0 +1 hc = -1 -1 -1 -1 0 +1 0 0 0 -1 0 +1 +1 +1 +1 (vertical edges) (horiz. edges) Sobel Operators: ---------------- More contribution from central pixel, compared to Prewitt. (4.16.7-2) hr = -1 0 +1 hc = -1 -2 -1 -2 0 +2 0 0 0 -1 0 +1 +1 +2 +1 (vertical edges) (horiz. edges) Roberts Operators: ------------------ Not vertically, horizontally aligned. (4.16.7-3) h1 = 1 0 h2 = 0 1 0 -1 -1 0 4.16.8 Gradient Magnitude and Direction --------------------------------------- Often, you are not interested in separating vertical and horizontal edges, you just want the total 'edginess', see section 4.16. For magnitude you apply eqn. 4.16.8-1: (4.16.8-1) Gm(f(r,c)) = |Gx| + |Gy| = |f(r,c)(*)hr| + |f(r,c)(*)hc| (this is simply eqn. 4.16-10 rephrased) '(*)' denotes convolution. or, (4.16.8-2) Gm(f(r,c))= sqrt[Gr**2 + Gc**2] Likewise, you can add the Roberts gradients. If you want direction: (4.16.8-3) Ga(f(r,c)) = arctan(Gr/Gc) (N.B. as ever, r corresponds to y, c corresponds to x, in the notation of some texts) Direction is sometimes useful, if you are trying to trace the path of a continuous edge, based on an edge enhanced image. Ex. 4.16.8-1 Apply a Sobel vertical gradient operator to the image given in Figure 4.16.8-1. Draw a picture of the result. What happens at corners? 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Figure 4.18.8-1 Letter 'T'. --------------------------- Ex. 4.16.8-2 Apply a Sobel horizontal gradient operator to the image given in Figure 4.16.8-1. Draw a picture of the result. Ex. 4.16.8-3 Add the results (absolute values) of Ex. 4.16.8-1, and -2. Draw a picture of the result. Ex. 4.16.8-4. Given an input image f[rl..rh][cl..ch], an output image g[rl..rh][cl..ch], write a computer program fragment to apply the Sobel vertical operator. Hint: Look at your answer for Ex. 4.12-2. Another example: Figures 4.18.8-2 (a-1) to (d-2) show the following sequence of images: (a) 16 x 16 image with rectangle (b) Sobel vertical gradient of (a) (c) Sobel horiz. gradient of (a) (d) Overall gradient (b) + (c) In each case the corresponding data are printed. 0123456789012345 ---------------- 0| | 1| | 2| | 3| | 4| MMMMMMMMM | 5| MMMMMMMMM | 6| MMMMMMMMM | 7| MMMMMMMMM | 8| MMMMMMMMM | 9| MMMMMMMMM | 0| MMMMMMMMM | 1| MMMMMMMMM | 2| MMMMMMMMM | 3| | 4| | 5| | +----------------+ +0123456789012345 DataLab I-00:00 24/01/92 14:16 (see Figure 4.13-2 for data) Figure 4.18.8-2 (a-1) 16 x 16 image with rectangle -------------------------------------------------- 0123456789012345 ---------------- 0| | 1| | 2| | 3| +,MMMMMMM,+ | 4| +,MMMMMMM,+ | 5| | 6| | 7| | 8| | 9| | 0| | 1| | 2| +,MMMMMMM,+ | 3| +,MMMMMMM,+ | 4| | 5| | +----------------+ +0123456789012345 DataLab I-06:00 24/01/92 14:25 Figure 4.18.8-2(b-1) Sobel vertical gradient -------------------------------------------- 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 99 297 396 396 396 396 396 396 396 297 99 0 0 0 0 0 99 297 396 396 396 396 396 396 396 297 99 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 99 297 396 396 396 396 396 396 396 297 99 0 0 0 0 0 99 297 396 396 396 396 396 396 396 297 99 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Figure 4.18.8-2(b-2) Data for (b-1) ----------------------------------- 0123456789012345 ---------------- 0| | 1| | 2| | 3| ++ ++ | 4| ,, ,, | 5| MM MM | 6| MM MM | 7| MM MM | 8| MM MM | 9| MM MM | 0| MM MM | 1| MM MM | 2| ,, ,, | 3| ++ ++ | 4| | 5| | +----------------+ +0123456789012345 DataLab I-07:00 24/01/92 14:24 Figure 4.18.8-2(c) Sobel Horizontal gradient. --------------------------------------------- 0123456789012345 ---------------- 0| | 1| | 2| | 3| M+++++++++M | 4| +-+++++++-+ | 5| ++ ++ | 6| ++ ++ | 7| ++ ++ | 8| ++ ++ | 9| ++ ++ | 0| ++ ++ | 1| ++ ++ | 2| +-+++++++-+ | 3| M+++++++++M | 4| | 5| | +----------------+ +0123456789012345 DataLab I-08:00 24/01/92 14:25 Figure 4.18.8-2(d-1) Sobel gradient - overall. --------------------------------------------- 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 198 396 396 396 396 396 396 396 396 396 198 0 0 0 0 0 396 594 396 396 396 396 396 396 396 594 396 0 0 0 0 0 396 396 0 0 0 0 0 0 0 396 396 0 0 0 0 0 396 396 0 0 0 0 0 0 0 396 396 0 0 0 0 0 396 396 0 0 0 0 0 0 0 396 396 0 0 0 0 0 396 396 0 0 0 0 0 0 0 396 396 0 0 0 0 0 396 396 0 0 0 0 0 0 0 396 396 0 0 0 0 0 396 396 0 0 0 0 0 0 0 396 396 0 0 0 0 0 396 396 0 0 0 0 0 0 0 396 396 0 0 0 0 0 396 594 396 396 396 396 396 396 396 594 396 0 0 0 0 0 198 396 396 396 396 396 396 396 396 396 198 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Figure 4.18.8-2(d-2) Data for (d-1) ----------------------------------- 4.17 Laplacian -------------- As mentioned above, edge enhancement is based on the highlighting of regions where the values are changing rapidly. Usually, we say we have an edge when the gradient is greater than a threshold. An alternative is to look for points where the gradient reaches a local extremum, i.e. where the second differential has a zero crossing. That is compute the second differential and mark as edge those points where it changes sign. The Laplscian gives a method of computing the discrete equivalent of the second differential. If you extend second differences (see eqn 4.16-6) to two-dimensions you get the following plus-shaped window: (4.17-1) hlap4 = -1 -1 4 -1 -1 This called the LAPLACIAN, after Laplace's differential equation: del2f/delx2 +del2f/dely2 = 0 even though it is really the negative of the Laplacian. The following square windows also approximate the Laplacian: (4.17-2) hlap8 = -1 -1 -1 -1 8 -1 -1 -1 -1 (4.17-3) hlap8 = 1 -2 1 -2 4 -2 1 -2 1 If you need direction (or separate x-, and y-components), you can use; (4.17-4) hlapr = -1 hlapc = -1 2 -1 2 -1 The Laplacian is sometimes called 'grad squared' - where 'grad' is written as capital delta upside down. We will write it 'gradsq'. The Laplacian tends to go a bit crazy on noisy images (i.e. noise is high frequency - Laplacian enhances high frequencies), so some sort of noise reduction - prior to applying Laplacian - is advisable. [Lim] p. 488 gives a method of eliminating some of the noise susceptability from the edge detection process: (1) Compute the Laplacian, (2) Compute the local variance (within a neighbourhood), (3) If Laplacian crosses zero: then: potential edge point, (4) If potential edge point AND local variance above a threshold, THEN: definite edge point. Ex. 4.17-1 Apply a Laplacian (hlap4) to Figure 4.13-1. Ex. 4.17-2 Smooth Figure 4.13-1, then apply the Laplacian, then detect zero crossings. 4.18 Edge Detection by Template Matching ---------------------------------------- Edge enhancement may also be approached by inserting in h templates of edges, e.g. eqn. 4.18-1 which enhances grey level variations along the 'North-West, South-East' diagonal, and eqn. 4.18-2 which enhances vertical edges (this is one of the Prewitt operators). 0 +2 +1 (4.18-1) A = +2 0 -2 -1 -2 0 1 0 -1 (4.18-2) A = 1 0 -1 1 0 -1 4.19 Highpass Filtering ----------------------- Smoothing operations are associated with lowpass filters, sharpening operations are associated with highpass filtering. Thus, more general highpass filtering operations (later lectures) may give alternative edge enhancement methods. 4.20 Marr-Hildreth Operators. ---------------------------- see [Lim] p. 488, [Niblack] p.86. These are nicknamed 'Mexican hat' operators - because of their shape; also called 'difference-of-Gaussian' (DOG). Based on an initial smoothing with a Gaussian lowpass filter window, to remove noise, and spurious edges: (4.20-1) h(r,c) = exp(-(r**2+c**2)/2.PI.s**2) where PI = 3.14159... s is a parameter, which gives the width of the smoothing window Then apply the Laplacian, followed by detection of zero crossings. The smoothing, Laplacian processes can be combined by taking the (continuous) Laplacian of the Gaussian function: (4.20-2) gradsq h(r,c) = exp(-(r**2+c**2)/2.PI.s**2).(r**2+y**2-2.PI.s**2)/(PI*s**2)**2 Ex. 4.20-1 For s= 1.0 calculate eqn. 4.20-1 for: (a) c=0, r= 0 to 10 in steps of 1. (b) r=0, c=0 to 10 in steps of 1. (c) Plot results on a graph. Ex. 4.20-2 Repeat Ex. 4.20-1 for the Marr-Hildreth operator, eqn 4.20-2. 4.21 Additional Exercises. -------------------------- Ex. 4.21-1. A monochrome TV camera (followed by a digitiser etc.) is monitoring parts passing along a conveyer belt. This time, the illumination is even, but unfortunately, the image is 'noisy'. If you continuously image a grey card (of constant grey level across the field of view, you get an estimate of 7 for the standard deviation of a pixel value (i.e. by estimating the std. dev. for a sequence of values for the same pixel). Suggest a method by which you can reduce the standard deviation to a level of 1. Assume that you can halt the conveyer belt, under control of the image acquisition computer. Ex. 4.21-2. A monochrome CCD TV camera (followed by a digitiser etc.) is monitoring parts passing along a conveyer belt. The CCD has 512 x 512 cells. Unfortunately, each cell has a bias (different for each); e.g. for a white card scene, cell (100,101) might give a value of 200, cell (50,29) might give 210, etc. Suggest how you would estimate the bias 'image'. How would you apply a correction. Ex. 4.21-3. A monochrome TV camera is monitoring tomatoes passing along a conveyer belt. Green tomatoes are frowned upon by the supermarket buyers. Suggest a technique by which green tomatoes may be highlighted. (Assume you can have the luxury of a separate camera for green tomato detection). Ex. 4.21-4. A 7 x 7 image is shown in Figure 4.21-1. (a) Show that the Sobel gradient operator, applied to Figure 4.21-1 will yield Figure 4.21-2, |G(r,c|; the image boundary pixels are not shown in Figure 4.21-2 - because you cannot apply the Sobel operator at these points. (b) If you choose a threshold of 100, what are the candidate edge points? (c) It is neccessary to perform edge thinning on the result of (b), i.e. to wide strips of edges. Suppose we decide that any point among the candidate edge points is a true edge point if it is a local maximum of |G| in either horizontal, or vertical directions. On this basis, determine edge points. 60 60 62 65 68 70 70 60 60 62 65 68 70 70 70 70 72 75 78 80 80 100 100 102 105 108 110 110 130 130 132 135 138 140 140 140 140 142 145 148 150 150 140 140 142 145 148 150 150 Figure 4.21-1 -------------- 40.8 44.7 46.6 44 7 40.8 160.2 161.2 161.8 161.2 160.2 240.1 240.8 241.2 240.8 240.1 160.2 161.2 161.8 161.2 160.2 40.8 44.7 46.6 44 7 40.8 Figure 4.21-2 ------------- Ex. 4.21-5. The input grey scale is[0,30]; (a) give the algorithm (three transformations) to compress the grey scale by a factor of 2 in the ranges [0,10] and [20,30], while stretching it by a factor of 2 in [10,20], (b) draw a graph of this transformation; output on vertical axis, input on horizontal (see Figure 4.3-1). 4.22 ANSWERS to SELECTED QUESTIONS. ---------------------------------- Ex. 4.21-5. The input grey scale is[0,30]; (a) give the algorithm (three transformations) to compress the grey scale by a factor of 2 in the ranges [0,10] and [20,30], while stretching it by a factor of 2 in [10,20], (b) draw a graph of this transformation; output on vertical axis, input on horizontal (see Figure 4.3-1). ANS: This is slightly tricky, in that the question is not fully specific; it should have been added that the transformation should be continuous - although that would normally be assumed anyway. Continuous answer. (i) [0..10] -> [0..5] (ii) [10..20] -> [5..25] (iii)[20..30] -> [25..30] if(betw(z,0,10))z':=(z-0)/2 + 0 if(betw(z,10,20))z':=(z-10)*2 + 5 if(betw(z,20,30))z':=(z-20)/2 + 25 where betw(z,z0,z1) = if(z>=z1 and z<=z2) Here I have made explicit the shift to zero BEFORE the scale, followed by a shift back (to the appropriate start point - the end point of the previous transformation). A non-continuous answer would be: (i) [0..10] -> [0..5] (ii) [10..20] -> [20..40] (iii)[20..30] -> [10..20] though this does not make too much sense. ------------------------------------------------------------- Ex. 4.7-2. Input image: 2 2 2 2 2 4 4 4 4 4 6 6 6 6 6 8 8 8 8 8 10 10 10 10 10 (a) What is the mean, m? What is the standard deviation, s? (b) Transform to mean, m'= 128, std.dev s' = 80. (c) Will it fit into [0,255]? (d) If not, 'clip' to 0 below 0 and to 255 above 255. ANS: m = (1/25)(sum) = (1/25)(2+2+2...+10+10) = (1/25)(10+20+30+40+50) = 6 = 6 v = (1/25)((2-6)**2....) = (1/25)(5*16+5*4+5*4+5*16) = 200/25 = 8 s = sqrt(v) = 2.8 Formula: z' = (z-m) x s'/s + m' = (z-6) x 80/2.8 + 128 at 2 z' = -4 x 80/2.8... +128 = 13.7 etc. ... at 10 z' = 4x80/2.8 +128 = 242. i.e. the range is reasonably filled. You can see that the purpose of such a transformation is to shift the data into the middle of the range, and expand the range. Ex. 4.8-9. Suppose that a 64 x 64, 8-level image has the grey-level distribution shown in Figure 4.8-5. I have also included in the table p(z), the density calculated from the histogram. Calculate the transformation function, T(z), to give histogram equalisation, using the algorithm given above. Then calculate q'(w), the actual output histogram, and plot it. Recommendation: Proceed as follows. Work out q(w). Then work out the cumulatives P(), and Q(); then apply the algorithm. Note: this example is taken from G&W *2nd ed.*, DON'T copy their answer, it's wrong! ------------------------------------------------------------ z ph(z) p(z)= q(w) P(z) Q(z) T(z) ph(z)/M.N ----------------------------------------------------------- 0 790 0.19 .125 .19 .125 1 1 1023 0.25 .125 .44 .25 3 2 850 0.21 .125 .65 .375 4 3 656 0.16 .125 .81 .5 5 4 329 0.08 .125 .89 .625 6 5 245 0.06 .125 .95 .75 7 6 122 0.03 .125 .98 .875 7 7 81 0.02 .125 1.0 1.0 7 ----------------------------------------------------------- Figure 4.8-5 Histogram for ex. 4.8-9 ------------------------------------ *******ANS see above. Note, you can work with histograms or with 'probabilities', so long as p() and q() are BOTH histograms or BOTH probabilities. q() = .125 is merely dividing the total probability (1.0) into 8 for 8 slots. ------------------------------------------------------------------ Ex. 4.8-10. Someone has asserted: "histogram modification (or any form of contrast stretching) is unlikely to be of assistance in a purely AUTOMATIC machine vision application (i.e. one in which there is no human intervention), since these techniques introduce no new information - in fact, they often destroy information" On the basis of Chapters 4.4 to 4.8, comment on this assertion. Hint: Look carefully at the histogram equalisation transformation function in Example 4.8-8. Compare the input and output histograms. How many grey levels in the input? How many in the output? ANS: (i) If you look at the transformation for Ex. 4.8-9 abovee you will see that there are EIGHT grey levels in the input image, and only SIX in the output (0, 2 are missing). Thus information was destroyed in the transformation. Thus features that were previously evident, may now be obscured. (ii) the purpose of histogram modification is to better match the picture to the human visual system - i.e. stretch the contrast. In a machine the most common activity will be comparing the values of two pixels; transforming the input (in the manner described above) cannot improve the outcome - but it can disimprove the outcome - i.e. by making pixels equal that were not equal in the input. There may be some case where the reason for the transformation is to do with the physics of the problem - for example, see the X-ray example in notes. Ex. 4.12-2. Given an input image f[rl..rh][cl..ch], an output image g[rl..rh][cl..ch], a filter mask (impulse response): h[-w..+w][-v..+v], write a computer program fragment to perform eqn. 4.12-1. Hint 1: there will be FOUR nested loops. ***ANS Start off with: r+w c+v (4.12-1) g(r,c) = # # f(k,l).h(r-k,c-l) k=r-w l=c-v where h() are the weights. NOTE: this assumes that h[] is stored in an a rectangular array h[-w..+w][-v..+v]. In most computer languages an array must start at 0 or 1; in C this is 0. Thus, in C the formula would have to read ...f[k][l]*h[r-k+w][c-l+v] so that the indices go from 0..w*2 + 1, 0..v*2 + 1 Code: (in C) for(r= rl+w;r<= rh-w; r++) for(c=cl+v;c<=ch-v;c++){ sum=0.0; for(k=r-w;k<=r+w;k++) for(l=c-v;l<=c+w;l++) sum+=f[k][l]*h[r-k+w][c-l+v]; g[r][c] = sum; } Test the fragment by dry running on an 'impulse' image. If you operate on an impulse with h[] you will get a copy of h[] where the impulse was. I.e. h[] is the impulse response of the operator. ------------------------------------------------------------------- Ex. 4.13-2 Perform 3x3 median filtering on the image given in Figure 4.13-1. Draw a picture of the result. What happens at corners? ANS: Median filtering removes all the 'spikes', and leaves the 'T' intact, EXCEPT it removes the corners. An averaging filter is not well suited to this sort of noise, OR this sort of image - where there are sharp edges, which presumably should be preserved. The noise is just smeared, not removed; the edges are blurred. ---------------------------------------------------------------- Ex. 4.16.8-1 Apply a Sobel vertical gradient operator to the image given in Figure 4.16.8-1. Draw a picture of the result. What happens at corners? Ex. 4.16.8-2 Apply a Sobel horizontal gradient operator to the image given in Figure 4.16.8-1. Draw a picture of the result. ANS: The Sobel operator gives 1,3, or 4 for positive edges, and -1, -3, -4 for negative edges; the magnitude depends on whether we are at the end or middle of an edge line. Usually sign does not matter; although, if you were following edges, using a similarity criterion, it might. Whan adding vertical and horiz. edges, the signs MUST be removed - otherwise you may lose edge points at corners etc. ---------------------------------------------------------------- Ex. 4.16.8-4. Given an input image f[rl..rh][cl..ch], an output image g[rl..rh][cl..ch], write a computer program fragment to apply the Sobel vertical operator. Hint: Look at your answer for Ex. 4.12-2. ***ANS h = 1 0 -1 2 0 -2 1 0 -1 Declare h[] as follows: int h[3][3] ={1,0,-1,2,0,-2,1,0,-1}; and use answer to Q.5. Note the way C stores such arrays. Of course, you can save time by avoiding multiplying by 1, and avoiding the 0s altogether - but that is a minor implementation concern, secondary to getting the program to work, and to making it readable. Given that C stores the array like it does, why is it better to have 'c', the second index, in the inner loop? ***ANS: If you are in a virtual memory system (e.g. paged), such an arrangement will generate far less vitual memory handling. Consider NR=NC=512=page size. Done the right way you may get away with 1 page fault. Done the wrong way, you may get 512 page faults. But, PCs don't have paging, I hear you say. Maybe, but, increasingly, they have caches - which operate in the same way to paging - only the pages are smaller. end of chapter 4.