When an isomorphous difference Patterson is calculated, GraphEnt will plot the normal probability diagram of the input data, together with a dotted reference line of gradient 1.0 and zero intercept (footnote 12). The use of normal probability plots for assessing the usefulness (or otherwise) of a putative derivative is well documented and will not be discussed here (see Howell, P.L. & Smith, G.D. (1992), J. Appl. Cryst., 25, 81-86, and Abrahams, S.C. & Keve, E.T. (1971), Acta Cryst., A 27, 157-165). If you scaled your (macromolecular) data using the program scaleit from the CCP4 suite, then although you have not seen the plot, you have seen the variation of its gradient and intercept versus resolution (using the program xloggraph on the .log file written by scaleit). The reason for repeating the calculation here is that the normal probability plot can also be used to select suspect data that do not fit an otherwise linear trend. The important thing is that the selection is not performed on the basis of just the magnitude of the difference (i.e. ||FPH| - |FP||, as happens in scaleit), but on the basis of both the observed amplitudes and their standard deviations. The normal probability plot, together with the "large contributions to Chi**2" table (files CHIcontributions.dat and CHIcontributions.ps), which is produced after the calculation is over, should allow you to justifiably select outliers (footnote 13).
This is achieved as follows : GraphEnt will write out an ASCII file (named Normplot_tails.dat) which contains the hkl indices of all reflections that comprise the tails of the plot. These points are shown in the graphics window in a different colour. If some of these points deviate significantly from the rest of the plot, then they are candidates for rejection (note that some deviation from linearity will always be present near the tails; what you are looking for is an outstanding deviation).
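To make the idea of plot points and tails concrete, here is a minimal Python sketch. It is not taken from the GraphEnt source: it assumes that each reflection contributes a difference weighted by both standard deviations, delta = (FPH - FP) / sqrt(sig(FP)**2 + sig(FPH)**2), and that the sorted deltas are paired with unit-normal quantiles at (i - 0.5)/N. The function names and the cutoff used to define the tails are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def normal_probability_points(reflections):
    """Return (expected_quantile, observed_delta, hkl) tuples, sorted by delta.

    `reflections` holds rows of (h, k, l, FP, sig(FP), FPH, sig(FPH)).
    ASSUMPTION: the weighted difference below is illustrative only;
    the statistic actually used by GraphEnt may differ.
    """
    deltas = []
    for h, k, l, fp, sig_fp, fph, sig_fph in reflections:
        delta = (fph - fp) / sqrt(sig_fp ** 2 + sig_fph ** 2)
        deltas.append((delta, (h, k, l)))
    deltas.sort()

    n = len(deltas)
    unit_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for rank, (delta, hkl) in enumerate(deltas):
        # ASSUMPTION: plotting position (rank + 0.5) / n for a zero-based rank.
        expected = unit_normal.inv_cdf((rank + 0.5) / n)
        points.append((expected, delta, hkl))
    return points


def tails(points, cutoff=2.0):
    """Keep the points whose expected quantile lies beyond +/- cutoff.

    ASSUMPTION: `cutoff` is a hypothetical parameter, not a GraphEnt setting.
    """
    return [p for p in points if abs(p[0]) > cutoff]
```

For normally distributed errors with correctly estimated standard deviations, the points returned by such a calculation would fall close to the reference line of gradient 1.0 and zero intercept.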
You can then match what you see in the plot with what is written in Normplot_tails.dat, decide which reflections to exclude, write their indices in an ASCII file with the name REJECT.HKL, and then re-run the program using the MAXENT_AUTO.IN file after adding the keyword REJECT (see page ). Because this sounds quite complicated, I will now give a detailed example to show how it works :
We start with just one .mtz file containing data for a putative derivative :
crystal2 ~/test
crystal2 ~/test d
total 260
-rw-r--r-- 1 glykos sys 262300 Dec 16 15:45 from_scaleit.mtz
crystal2 ~/test
crystal2 ~/test mtzdump hklin from_scaleit.mtz

 ##########################################################
 ##########################################################
 ##########################################################
 ### CCP PROGRAM SUITE: MTZDUMP VERSION 3.5: 18/06/98##
 ##########################################################
 ...............

 OVERALL FILE STATISTICS for resolution range 0.001 - 0.245
 =======================

 Col  Sort     Min     Max     Num        %     Mean    Mean    Resolution    Type  Column
 num  order                  Missing  complete          abs.    Low    High         label

   1  ASC      -46      35       0    100.00   -11.3    18.0   35.81   2.02    H    H
   2  NONE       0      11       0    100.00     4.0     4.0   35.81   2.02    H    K
   3  NONE       0      31       0    100.00    12.3    12.3   35.81   2.02    H    L
   4  NONE     4.4   902.0       3     99.96   92.65   92.65   35.81   2.02    F    FP
   5  NONE     0.6    26.2       3     99.96    3.34    3.34   35.81   2.02    Q    SIGFP
   6  NONE     8.7   956.3    3500     51.49  137.07  137.07   18.78   2.50    F    FPH
   7  NONE     1.2    41.5    3500     51.49    7.95    7.95   18.78   2.50    Q    SIGPH
   8  NONE   -73.2    72.3    3718     48.47    0.29    7.29   18.78   2.51    D    DPH
   9  NONE     0.0    66.8    3718     48.47   11.85   11.85   18.78   2.51    Q    SIGDPH

 No. of reflections used in FILE STATISTICS 7215

 LIST OF REFLECTIONS
 ===================
 ...............

 MTZDUMP: Normal termination of mtzdump
 Times: User: 0.2s System: 0.1s Elapsed: 0:03
crystal2 ~/test
crystal2 ~/test
Then, we run GraphEnt on the centrosymmetric [010] projection :
crystal2 ~/test
crystal2 ~/test GraphEnt h0l 10 3 from_scaleit.mtz
___________________________________________________________________________________________________________________________

[GraphEnt banner]

Gull, S.F. & Daniell, G.J. (1978), Nature, 272, 686-690
Collins, D.M. (1982), Nature, 298, 49-51
NMG
___________________________________________________________________________________________________________________________
- Assuming that input is a .mtz file. Interpreting ...
..............................................
- Now trying lambda = 0.010000
.............................................
- Initial value for lambda set to 1000.000000
___________________________________________________________________________________________________________________________
- MAXENT starts here
Chi**2 : 1593.822   R : 1.0000   Lambda : 1000.00000   Nobs : 366
Chi**2 : 1588.187   R : 0.9992   Lambda : 1000.00000   Nobs : 366
........................................................................................
Chi**2 :  365.790   R : 0.5621   Lambda :  945.19320   Nobs : 366

803 cycles in 74 seconds, giving an average of 0.092 seconds per cycle.
___________________________________________________________________________________________________________________________
CONVERGENCE ACHIEVED. The final R-factor between the observed and calculated amplitudes is 0.5621040
........................................
Normal termination ? (100 seconds)
Now we have all these files :
crystal2 ~/test d
total 652
-rw-r--r-- 1 glykos sys     68 Dec 16 15:51 CHIcontributions.dat
-rw-r--r-- 1 glykos sys  37224 Dec 16 15:51 CHIcontributions.ps
-rw-r--r-- 1 glykos sys  31101 Dec 16 15:49 MAXENT_AUTO.IN
-rw-r--r-- 1 glykos sys  30595 Dec 16 15:48 MAXENT_FROM_MTZ.in
-rw-r--r-- 1 glykos sys    103 Dec 16 15:48 MAXENT_FROM_MTZ_ANOMALOUS.in
-rw-r--r-- 1 glykos sys  10365 Dec 16 15:48 Normal_probability.ps
-rw-r--r-- 1 glykos sys    825 Dec 16 15:48 Normplot_tails.dat
-rw-r--r-- 1 glykos sys 132176 Dec 16 15:50 conventional.map
-rw-r--r-- 1 glykos sys 262300 Dec 16 15:45 from_scaleit.mtz
-rw-r--r-- 1 glykos sys 132176 Dec 16 15:51 maxent.map
crystal2 ~/test
Both CHIcontributions.dat and Normplot_tails.dat point to problems with reflections 0,0,11 and -12,0,8 :
crystal2 ~/test
crystal2 ~/test
crystal2 ~/test more CHIcontributions.dat
   0   0  11    55.19882
 -12   0   8    59.04416
crystal2 ~/test
crystal2 ~/test
crystal2 ~/test more Normplot_tails.dat
   0   0   6    -2.99385   -30.69588
   2   0   4    -2.64107   -28.07780
 -12   0   8    -2.46310   -26.91301
   0   0  11    -2.34000   -25.43124
  -4   0   8    -2.24461   -22.29077
   0   0   7    -2.16611   -21.85669
   4   0  10    -2.09905   -18.73302
 -16   0   5    +2.04028   +10.47118
   8   0   6    +2.09905   +10.55087
   4   0   4    +2.16611   +10.55754
  -8   0   9    +2.24461   +11.08962
   6   0   3    +2.34000   +11.90654
   4   0   6    +2.46310   +12.23197
 -16   0  10    +2.64107   +12.45890
   2   0   6    +2.99385   +13.12762
crystal2 ~/test
crystal2 ~/test
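The cross-check between the two files can also be done mechanically. The following Python sketch (the helper names are hypothetical, and it assumes only that the first three columns of each file hold h, k and l) lists the reflections that appear in both listings, using the file contents shown above:

```python
def read_hkl(text):
    """Return the (h, k, l) indices found in the first three columns."""
    indices = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            indices.append(tuple(int(f) for f in fields[:3]))
    return indices


def flagged_by_both(chi_text, tails_text):
    """Reflections that appear in both listings, in sorted order."""
    return sorted(set(read_hkl(chi_text)) & set(read_hkl(tails_text)))


# Contents copied from the two listings in this example.
CHI = """\
0 0 11 55.19882
-12 0 8 59.04416
"""
TAILS = """\
0 0 6 -2.99385 -30.69588
2 0 4 -2.64107 -28.07780
-12 0 8 -2.46310 -26.91301
0 0 11 -2.34000 -25.43124
-4 0 8 -2.24461 -22.29077
0 0 7 -2.16611 -21.85669
4 0 10 -2.09905 -18.73302
-16 0 5 +2.04028 +10.47118
8 0 6 +2.09905 +10.55087
4 0 4 +2.16611 +10.55754
-8 0 9 +2.24461 +11.08962
6 0 3 +2.34000 +11.90654
4 0 6 +2.46310 +12.23197
-16 0 10 +2.64107 +12.45890
2 0 6 +2.99385 +13.12762
"""

print(flagged_by_both(CHI, TAILS))  # [(-12, 0, 8), (0, 0, 11)]
```

Agreement between the two files is only a hint; the decision on what to reject should still rest on inspection of the plot itself.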
The normal probability plot suggests that all seven reflections in the lower left-hand corner are suspect. Its somewhat sigmoidal shape points to the presence of non-normally distributed (systematic) errors :
Let's repeat the calculation with these seven reflections excluded. The first step is to create a file with the name REJECT.HKL whose first three columns contain the indices of the reflections to be excluded :
crystal2 ~/test
crystal2 ~/test cp Normplot_tails.dat REJECT.HKL
crystal2 ~/test ed REJECT.HKL
crystal2 ~/test more REJECT.HKL
   0   0   6    -2.99385   -30.69588
   2   0   4    -2.64107   -28.07780
 -12   0   8    -2.46310   -26.91301
   0   0  11    -2.34000   -25.43124
  -4   0   8    -2.24461   -22.29077
   0   0   7    -2.16611   -21.85669
   4   0  10    -2.09905   -18.73302
crystal2 ~/test
crystal2 ~/test
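The copy-and-edit step can also be scripted. In this particular example the seven suspect reflections happen to be exactly the rows of Normplot_tails.dat with a negative value in the fourth column, so a minimal Python sketch could select them on that basis. The selection rule and the helper names are assumptions made for this example only; in general the choice should follow from inspecting the plot.

```python
from pathlib import Path


def negative_tail_lines(tails_text):
    """Return the lines whose fourth column is negative.

    ASSUMPTION: the first three columns are h, k, l and the fourth column
    separates the two tails by sign; this reading is inferred from the
    listing in this example, not from a format specification.
    """
    kept = []
    for line in tails_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and float(fields[3]) < 0.0:
            kept.append(line)
    return kept


def write_reject_file(tails_path="Normplot_tails.dat", reject_path="REJECT.HKL"):
    """Copy the negative-tail lines of tails_path into reject_path.

    Returns the number of reflections written.
    """
    lines = negative_tail_lines(Path(tails_path).read_text())
    Path(reject_path).write_text("\n".join(lines) + "\n")
    return len(lines)
```

Calling write_reject_file() in the directory of the example would be expected to write the same seven lines as the manual edit. Only the first three columns are needed in REJECT.HKL, so leaving the remaining columns in place is harmless.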
Then, we edit the file MAXENT_AUTO.IN and we add the keyword REJECT :
crystal2 ~/test
crystal2 ~/test ed MAXENT_AUTO.IN
crystal2 ~/test more -20 MAXENT_AUTO.IN
REJECT
CELL 94.14900 24.17000 64.31901 90.00000 130.36700 90.00000
SPACEGROUP 1
MAP_FORMAT CCP4
DIFF_PATT
PERMUTATION 3 1 2
GRID 128 256 1
GRACYCLES 80
GRATWOWINDOWS
REFLECTIONS
 -30   0   9    89.88602    3.43968   123.75751   12.84017
 -30   0  10   126.17858    3.93975   110.84611   10.25688
 -30   0  11    38.71215    5.14720    36.43570   15.66436
 -30   0  12   165.68549    4.99690   154.67838    7.42726
 -30   0  13    38.65771    4.30664    43.59790   16.74030
 -30   0  14   158.72888    4.75254   159.23166    5.49528
 -30   0  15    86.40644    3.25414    84.79811   15.51947
 -30   0  16   150.11438    4.57498   146.66685    5.11194
 -30   0  17   132.07582    4.08662   164.78131    5.11169
 -30   0  18    21.89952    8.06613    23.18951   10.87039
.................................................................................
crystal2 ~/test
crystal2 ~/test
... and we run it again, but this time giving as input the MAXENT_AUTO.IN file :
crystal2 ~/test
crystal2 ~/test GraphEnt MAXENT_AUTO.IN
___________________________________________________________________________________________________________________________

[GraphEnt banner]

Gull, S.F. & Daniell, G.J. (1978), Nature, 272, 686-690
Collins, D.M. (1982), Nature, 298, 49-51
NMG
___________________________________________________________________________________________________________________________
Keyword REJECT        : 7 reflections specified in REJECT.HKL.
Keyword CELL          : Cell dimensions set to 94.15 24.17 64.32 90.00 130.37 90.00
Keyword SPACEGROUP    : space group number set to 1
Keyword MAP_FORMAT    : CCP4 map file selected.
Keyword DIFF_PATT     : Difference Patterson map run [h k l FP sig(FP) FPH sig(FPH)].
Keyword PERMUTATION   : Permutation set to 3 1 2
Keyword GRID          : Grid set to 128 256 1
Keyword GRACYCLES     : Plot every 80 cycles.
Keyword GRATWOWINDOWS : Will keep conventional map plot.
Keyword REFLECTIONS   : start reading reflections.
Reflection rejected : -12   0   8
Reflection rejected :  -4   0   8
Reflection rejected :   0   0   6
Reflection rejected :   0   0   7
Reflection rejected :   0   0  11
Reflection rejected :   2   0   4
Reflection rejected :   4   0  10
___________________________________________________________________________________________________________________________
...........................................................................................................................
Normal termination ? (32 seconds)
NOTE WELL : Because the normal probability plot is calculated with
data expanded to P1, each point on the plot may actually correspond to a
superposition of several symmetry-equivalent reflections. When you reject
data, you MUST reject all symmetry-equivalent reflections that are present
in your P1 data set. Failure to do so will show up in your maps as absence
of the expected symmetry elements. Now : under normal circumstances the
Normplot_tails.dat file will contain all symmetry-equivalent reflections,
unless these lie near the assumed linear part of the plot. In this case,
I'm afraid that you will have to manually add the indices of the missing
equivalents to the REJECT.HKL file (sorry).
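
Finding the missing equivalents by hand is error-prone, so a small helper can list them. The Python sketch below is only an illustration: the function names are hypothetical, the symmetry operators must be supplied by you for your own space group, and it assumes that indices transform as the row vector (h k l) multiplied by each 3x3 rotation matrix. The two operators shown (identity and a two-fold axis along b) are an example, not a statement about the data set used above.

```python
def equivalents(hkl, rotations):
    """Indices generated from hkl by each 3x3 integer rotation matrix.

    ASSUMPTION: indices transform as the row vector (h k l) times the matrix.
    """
    return {
        tuple(sum(hkl[i] * rot[i][j] for i in range(3)) for j in range(3))
        for rot in rotations
    }


def missing_equivalents(rejected, observed, rotations):
    """Equivalents of rejected reflections that are observed but not rejected."""
    rejected = set(rejected)
    observed = set(observed)
    wanted = set()
    for hkl in rejected:
        wanted |= equivalents(hkl, rotations)
    return sorted((wanted & observed) - rejected)


# HYPOTHETICAL operators: identity and a two-fold axis along b.
IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
TWOFOLD_B = ((-1, 0, 0), (0, 1, 0), (0, 0, -1))
```

Here `rejected` would hold the indices already listed in REJECT.HKL and `observed` the indices of all reflections in the P1 data set; whatever the function returns would still have to be appended to REJECT.HKL. If your P1 data set also stores Friedel mates, the inversion operator would have to be included in the list as well.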