Standard Preprocess API

This script formats data for reconstruction according to configuration.

cohere_core.standard_preprocess.prep(beamline_full_datafile_name, auto, **kwargs)
This function formats data for reconstruction and saves it in the data.tif file. The preparation consists of the following steps:
  • removing the aliens: aliens are areas of the data that are an effect of interference. The area is set manually in a configuration file after inspecting the data. Alternatively, a mask file with the same dimensions as the data can be used. Another option is the AutoAlien1 algorithm, which removes the aliens automatically.

  • clearing the noise: values below an amplitude threshold are set to zero

  • amplitudes are set to the square root of the data values

  • cropping and padding. If adjust_dimensions is negative in any dimension, the array is cropped in that dimension. Cropping is followed by padding in the dimensions that have a positive adjust_dimensions value. After this adjustment, the dimensions are adjusted further to find the smallest dimension that is supported by the OpenCL library (multiplier of 2, 3, and 5).

  • centering - finding the greatest amplitude and placing it at the center of the array. If center_shift is defined, the center is shifted accordingly.

  • binning - adding amplitudes of several consecutive points. Binning can be done in any dimension.
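
The noise-clearing, square-root, centering, and binning steps above can be sketched with NumPy. This is a minimal illustration and not the cohere_core implementation: the helper names threshold_and_sqrt, center_max, and bin_array are hypothetical, centering by a circular shift (np.roll) is an assumption, and points that do not fill a complete bin are simply dropped here.

```python
import numpy as np


def threshold_and_sqrt(data, intensity_threshold):
    """Zero values below the threshold, then take the square root."""
    cleaned = np.where(data < intensity_threshold, 0.0, data)
    return np.sqrt(cleaned)


def center_max(data, center_shift=(0, 0, 0)):
    """Move the maximum to the array center, then apply center_shift.

    Assumption: a circular shift is used, so points wrap around the edges.
    """
    max_idx = np.unravel_index(np.argmax(data), data.shape)
    shifts = [int(dim // 2 - idx + extra)
              for dim, idx, extra in zip(data.shape, max_idx, center_shift)]
    return np.roll(data, shifts, axis=tuple(range(data.ndim)))


def bin_array(data, binning=(1, 1, 1)):
    """Sum consecutive points along each axis.

    Trailing points that do not fill a complete bin are dropped.
    """
    for axis, b in enumerate(binning):
        n = (data.shape[axis] // b) * b
        data = np.take(data, np.arange(n), axis=axis)
        new_shape = data.shape[:axis] + (n // b, b) + data.shape[axis + 1:]
        data = data.reshape(new_shape).sum(axis=axis + 1)
    return data


# Usage on a small synthetic array (values chosen only for illustration).
data = np.zeros((4, 4, 4))
data[0, 1, 2] = 16.0   # the peak
data[3, 3, 3] = 1.0    # below the threshold, will be zeroed
amp = threshold_and_sqrt(data, intensity_threshold=2.0)
centered = center_max(amp)                        # peak moves to (2, 2, 2)
binned = bin_array(centered, binning=(2, 1, 1))   # shape becomes (2, 4, 4)
```

Centering is done before binning here, matching the order of the steps listed above.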

Parameters:
  • beamline_full_datafile_name (str) – full name of tif file containing beamline preprocessed data

  • kwargs (keyword arguments) –

    data_dir (str)

    Directory where prepared data will be saved; default is <experiment_dir>/phasing_data.

    alien_alg (str)

    Name of the method used to remove aliens. Possible options are: ‘block_aliens’, ‘alien_file’, and ‘AutoAlien1’. The ‘block_aliens’ algorithm zeroes out defined blocks, the ‘alien_file’ method uses a given file as a mask, and ‘AutoAlien1’ uses an automatic mechanism to remove aliens. Each of these algorithms requires different parameters.

    aliens (list)

    Needed when the ‘block_aliens’ method is configured. Used when the data contains regions with intensity produced by interference. These regions need to be zeroed out. The aliens are defined as a list of regions, each given by the coordinates of its starting point and ending point (i.e. [[xb0,yb0,zb0,xe0,ye0,ze0],[xb1,yb1,zb1,xe1,ye1,ze1],…[xbn,ybn,zbn,xen,yen,zen]] ).

    alien_file (str)

    Needed when the ‘alien_file’ method is configured. The user can produce a file in npy format that contains an array of zeros and ones, where zero means the pixel is set to zero and one means it is left unchanged.

    AA1_size_threshold (float)

    Used in the ‘AutoAlien1’ method. If not given, it defaults to 0.01. The AutoAlien1 algorithm calculates the relative sizes of all clusters with respect to the biggest cluster. Clusters with a relative size smaller than the given threshold may be deemed aliens, depending also on their asymmetry.

    AA1_asym_threshold (float)

    Used in the ‘AutoAlien1’ method. If not given, it defaults to 1.75. The AutoAlien1 algorithm calculates the average asymmetry of all clusters. Clusters with an average asymmetry greater than the given threshold may be deemed aliens, depending also on their relative size.

    AA1_min_pts (int)

    Used in the ‘AutoAlien1’ method. If not given, it defaults to 5. Defines the minimum number of nonzero points in a neighborhood for the area of data to count as a cluster.

    AA1_eps (float)

    Used in the ‘AutoAlien1’ method. If not given, it defaults to 1.1. Used in the clustering algorithm.

    AA1_amp_threshold (float)

    Mandatory in the ‘AutoAlien1’ method. Used to zero data points below that threshold.

    AA1_save_arrs (boolean)

    Used in the ‘AutoAlien1’ method, optional. If given and set to True, multiple results of the alien analysis are saved in files.

    AA1_expandcleanedsigma (float)

    Used in the ‘AutoAlien1’ method, optional. If given, the algorithm applies a final step of cleaning the data using the configured sigma.

    intensity_threshold (float)

    Mandatory, min data threshold. Intensity values below this are set to 0. The threshold is applied after removing aliens.

    adjust_dimensions (list)

    Optional, a list of numbers to adjust the size at each side of the 3D data. If a number is positive, the array is padded; if negative, cropped. The parameters correspond to [x left, x right, y left, y right, z left, z right]. The final dimensions are adjusted up to a good number for the FFT that is also compatible with OpenCL-supported dimensions: powers of 2 or a*2^n, where a is 3, 5, or 9.

    center_shift (list)

    Optional, a list defining the center shift. The array maximum is centered before binning and then moved according to center_shift; [0,0,0] has no effect.

    binning (list)

    Optional, a list that defines binning values in respective dimensions, [1,1,1] has no effect.

    do_auto_binning (boolean)

    Optional; mandatory if auto_data is True. If True, auto binning is done; otherwise it is not.

    debug (boolean)

    A command line argument passed as a parameter. If True, verifier errors are ignored.
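
To make the aliens, alien_file, and adjust_dimensions parameters more concrete, here is a minimal NumPy sketch of what they describe. It is not the cohere_core implementation: the helper names block_aliens, apply_alien_mask, and crop_and_pad are hypothetical, treating the ending point of an alien block as inclusive is an assumption, and the final adjustment to an FFT-friendly size is left out.

```python
import numpy as np


def block_aliens(data, aliens):
    """Zero out each block given as [xb, yb, zb, xe, ye, ze]."""
    cleaned = data.copy()
    for xb, yb, zb, xe, ye, ze in aliens:
        # Assumption: the ending point is part of the block (inclusive).
        cleaned[xb:xe + 1, yb:ye + 1, zb:ze + 1] = 0
    return cleaned


def apply_alien_mask(data, mask):
    """Keep pixels where the mask is one, zero them where it is zero."""
    if mask.shape != data.shape:
        raise ValueError("mask must have the same dimensions as data")
    return data * mask


def crop_and_pad(data, adjust_dimensions):
    """Crop sides with negative entries, then zero-pad sides with positive ones.

    adjust_dimensions is [x left, x right, y left, y right, z left, z right].
    """
    pairs = [(adjust_dimensions[2 * i], adjust_dimensions[2 * i + 1])
             for i in range(data.ndim)]
    # Crop first: a negative entry removes that many points from its side.
    slices = tuple(
        slice(max(-lo, 0), dim - max(-hi, 0))
        for (lo, hi), dim in zip(pairs, data.shape)
    )
    cropped = data[slices]
    # Then pad: a positive entry adds that many zeros on its side.
    pad_width = [(max(lo, 0), max(hi, 0)) for lo, hi in pairs]
    return np.pad(cropped, pad_width, mode="constant")


# Usage on a small synthetic array (values chosen only for illustration).
data = np.ones((4, 4, 4))
no_block = block_aliens(data, [[0, 0, 0, 1, 1, 1]])   # zeroes a 2x2x2 corner

mask = np.ones((4, 4, 4))
mask[0, 0, 0] = 0            # a mask file could be read with np.load
masked = apply_alien_mask(data, mask)

resized = crop_and_pad(data, [-1, 0, 2, 1, 0, -2])    # shape becomes (3, 7, 2)
```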