Standard Preprocess API
This script formats data for reconstruction according to configuration.
- cohere_core.standard_preprocess.prep(beamline_full_datafile_name, auto, **kwargs)
- This function formats data for reconstruction and saves it in the data.tif file. The preparation consists of the following steps:
removing the aliens: aliens are areas of the data produced by interference. The areas can be set manually in the configuration file after inspecting the data, or given as a mask file with the same dimensions as the data. Another option is the AutoAlien1 algorithm, which removes the aliens automatically.
clearing the noise: values below an amplitude threshold are set to zero
taking the square root: amplitudes are set to the square root of the intensity values
cropping and padding: if adjust_dimensions is negative for any dimension, the array is cropped in that dimension. Cropping is followed by padding in the dimensions that have a positive adjust_dimensions value. After this adjustment, the dimensions are adjusted further to the smallest size supported by the OpenCL library (multiples of 2, 3, and 5).
centering: finding the greatest amplitude and placing it at the center of the array. If a center shift is defined, the center is shifted accordingly.
binning: adding the amplitudes of several consecutive points. Binning can be done in any dimension.
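The thresholding, square-root, centering, and binning steps above can be sketched with NumPy as follows. This is a minimal illustration and not the cohere_core implementation: the helper names are hypothetical, centering is done here by a wrap-around roll (an assumption; the library may handle edges differently), and points that do not fill a complete bin are simply dropped.

```python
import numpy as np

def threshold_and_sqrt(data, intensity_threshold):
    """Zero values below the threshold, then take the square root."""
    cleaned = np.where(data < intensity_threshold, 0.0, data)
    return np.sqrt(cleaned)

def center_on_max(arr, center_shift=(0, 0, 0)):
    """Roll the array so its maximum lands at the center plus center_shift.

    Assumption: wrap-around rolling is used for simplicity.
    """
    max_idx = np.unravel_index(np.argmax(arr), arr.shape)
    shifts = [dim // 2 - idx + s
              for dim, idx, s in zip(arr.shape, max_idx, center_shift)]
    return np.roll(arr, shifts, axis=tuple(range(arr.ndim)))

def bin_sum(arr, binning):
    """Sum groups of consecutive points along each dimension."""
    # Drop trailing points that do not fill a complete bin.
    trimmed = arr[tuple(slice(0, (dim // b) * b)
                        for dim, b in zip(arr.shape, binning))]
    # Split every axis into (number of bins, bin size) and sum over bin sizes.
    new_shape = []
    for dim, b in zip(trimmed.shape, binning):
        new_shape.extend([dim // b, b])
    reshaped = trimmed.reshape(new_shape)
    return reshaped.sum(axis=tuple(range(1, 2 * arr.ndim, 2)))
```

With these helpers, a binning of [1,1,1] and a center shift of [0,0,0] leave the centered array unchanged, which matches the "no effect" values described for the corresponding parameters.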
- Parameters:
beamline_full_datafile_name (str) – full name of the tif file containing beamline-preprocessed data
kwargs (keyword arguments) –
- data_dir (str)
Directory where the prepared data will be saved; the default is <experiment_dir>/phasing_data.
- alien_alg (str)
Name of the method used to remove aliens. Possible options are: ‘block_aliens’, ‘alien_file’, and ‘AutoAlien1’. The ‘block_aliens’ algorithm zeroes out defined blocks, the ‘alien_file’ method uses a given file as a mask, and ‘AutoAlien1’ uses an automatic mechanism to remove aliens. Each of these algorithms requires different parameters.
- aliens (list)
Needed when the ‘block_aliens’ method is configured. Used when the data contains regions with intensity produced by interference. These regions need to be zeroed out. Each alien region is defined by the coordinates of its starting point and ending point (i.e. [[xb0,yb0,zb0,xe0,ye0,ze0],[xb1,yb1,zb1,xe1,ye1,ze1],…[xbn,ybn,zbn,xen,yen,zen]] ).
- alien_file (str)
Needed when the ‘alien_file’ method is configured. The user can produce a file in npy format that contains an array of zeros and ones, where zero means the pixel is set to zero and one means it is left unchanged.
- AA1_size_threshold (float)
Used in the ‘AutoAlien1’ method. Defaults to 0.01 if not given. The AutoAlien1 algorithm calculates the relative sizes of all clusters with respect to the biggest cluster. Clusters with a relative size smaller than this threshold may be deemed aliens, depending also on their asymmetry.
- AA1_asym_threshold (float)
Used in the ‘AutoAlien1’ method. Defaults to 1.75 if not given. The AutoAlien1 algorithm calculates the average asymmetry of all clusters. Clusters with an average asymmetry greater than this threshold may be deemed aliens, depending also on their relative size.
- AA1_min_pts (int)
Used in the ‘AutoAlien1’ method. Defaults to 5 if not given. Defines the minimum number of non-zero points in a neighborhood for the area of data to count as a cluster.
- AA1_eps (float)
Used in the ‘AutoAlien1’ method. Defaults to 1.1 if not given. Used in the clustering algorithm.
- AA1_amp_threshold (float)
Mandatory in the ‘AutoAlien1’ method. Data points below this threshold are set to zero.
- AA1_save_arrs (boolean)
Used in the ‘AutoAlien1’ method, optional. If given and set to True, multiple results of the alien analysis are saved in files.
- AA1_expandcleanedsigma (float)
Used in the ‘AutoAlien1’ method, optional. If given, the algorithm applies a last step of cleaning the data using the configured sigma.
- intensity_threshold (float)
Mandatory, minimum data threshold. Intensity values below this are set to 0. The threshold is applied after removing aliens.
- adjust_dimensions (list)
Optional, a list of numbers used to adjust the size at each side of the 3D data. If a number is positive, the array is padded; if negative, it is cropped. The parameters correspond to [x left, x right, y left, y right, z left, z right]. The final dimensions are then adjusted up to a size that is efficient for the FFT and compatible with the dimensions supported by OpenCL: powers of 2, or a*2^n where a is 3, 5, or 9.
- center_shift (list)
Optional, a list defining the center shift. The array maximum is centered before binning and then moved according to center_shift; [0,0,0] has no effect.
- binning (list)
Optional, a list that defines binning values in the respective dimensions; [1,1,1] has no effect.
- do_auto_binning (boolean)
Mandatory if auto_data is True, optional otherwise. If True, auto binning is performed; otherwise it is not.
- debug (boolean)
A command-line argument passed as a parameter. If True, verifier errors are ignored.
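To make the aliens, alien_file, and adjust_dimensions parameters more concrete, the following NumPy sketch shows one way such operations could behave. It is an illustration under stated assumptions, not the library's code: the function names are hypothetical, the ending point of each alien block is assumed to be inclusive, the mask is passed as an already loaded array rather than a file name, and padding is assumed to use zeros. The further adjustment to an OpenCL-supported size is not shown.

```python
import numpy as np

def block_aliens(data, aliens):
    """Zero out each region given as [xb, yb, zb, xe, ye, ze].

    Assumption: the ending point is treated as inclusive.
    """
    out = data.copy()
    for xb, yb, zb, xe, ye, ze in aliens:
        out[xb:xe + 1, yb:ye + 1, zb:ze + 1] = 0
    return out

def apply_alien_mask(data, mask):
    """Apply a mask of zeros and ones: 0 zeroes the pixel, 1 keeps it."""
    if mask.shape != data.shape:
        raise ValueError("mask must have the same dimensions as the data")
    return data * mask

def adjust_dimensions(data, adjust):
    """Crop (negative values) and then zero-pad (positive values) each side.

    adjust is [x left, x right, y left, y right, z left, z right].
    """
    pairs = [(adjust[2 * i], adjust[2 * i + 1]) for i in range(data.ndim)]
    # Crop first: a negative entry removes that many points from that side.
    slices = tuple(slice(max(-lo, 0), dim - max(-hi, 0))
                   for dim, (lo, hi) in zip(data.shape, pairs))
    cropped = data[slices]
    # Then pad: a positive entry adds that many zeros on that side.
    pad_width = [(max(lo, 0), max(hi, 0)) for lo, hi in pairs]
    return np.pad(cropped, pad_width)
```

For example, under these assumptions adjust_dimensions(data, [-1, 2, 0, 0, 1, -1]) on a 4x4x4 array removes one plane from the left of x, pads two planes on the right of x, pads one plane on the left of z, and removes one plane from the right of z.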