Standard Preprocess API

Formats data for reconstruction according to configuration.

cohere_core.data.standard_preprocess.prep(beamline_full_datafile_name, **kwargs)
This function formats data for reconstruction and saves it in the file data.tif. The preparation consists of the following steps:
  • removing aliens, which are artifacts caused by interference. Aliens can be removed by defining regions or by supplying a mask file, both of which require manual inspection of the data file, or automatically with the AutoAlien1 algorithm.

  • clearing the noise, where values below an amplitude threshold are set to zero. The threshold can be set as a parameter or determined automatically.

  • taking the square root of the data values, which converts intensities to amplitudes

  • cropping and padding. If the crop-pad value is negative in any dimension, the array is cropped in this dimension. The cropping is followed by padding in the dimensions with positive values. After this adjustment, each dimension is increased to the smallest size supported by the OpenCL library (a product of powers of 2, 3, and 5).

  • centering - finding the greatest amplitude and placing it at the center of the array. If a center shift is defined, the center is shifted accordingly.

  • binning - adding amplitudes of several consecutive points. Binning can be done in any dimension.
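
The steps above, excluding alien removal, can be sketched with NumPy as follows. This is a minimal illustration and not the cohere implementation. All helper names (`next_235_size`, `crop_pad`, `pad_to_235`, `center_max`, `bin_sum`, `sketch_prep`) are hypothetical. Centering with a circular roll and splitting the extra padding between both sides are assumptions made for the sketch.

```python
import numpy as np


def next_235_size(n):
    """Smallest integer >= n (n >= 1) whose only prime factors are 2, 3 and 5."""
    while True:
        m = n
        for p in (2, 3, 5):
            while m % p == 0:
                m //= p
        if m == 1:
            return n
        n += 1


def crop_pad(arr, crop_pad_values):
    """Apply [x left, x right, y left, y right, z left, z right] adjustments."""
    slices, pads = [], []
    for axis in range(arr.ndim):
        left = crop_pad_values[2 * axis]
        right = crop_pad_values[2 * axis + 1]
        # Negative entries crop that side of the axis.
        start = -left if left < 0 else 0
        stop = arr.shape[axis] + right if right < 0 else arr.shape[axis]
        slices.append(slice(start, stop))
        # Positive entries pad that side with zeros.
        pads.append((max(left, 0), max(right, 0)))
    return np.pad(arr[tuple(slices)], pads)


def pad_to_235(arr):
    """Zero-pad each axis up to the next product of powers of 2, 3 and 5."""
    pads = []
    for n in arr.shape:
        extra = next_235_size(n) - n
        pads.append((extra // 2, extra - extra // 2))
    return np.pad(arr, pads)


def center_max(arr, shift=(0, 0, 0)):
    """Move the maximum to the array center, then apply the extra shift."""
    peak = np.unravel_index(np.argmax(arr), arr.shape)
    moves = [n // 2 - p + s for n, p, s in zip(arr.shape, peak, shift)]
    return np.roll(arr, moves, axis=tuple(range(arr.ndim)))


def bin_sum(arr, binning):
    """Sum groups of consecutive points along each axis."""
    # Trim so that every axis length is divisible by its bin size.
    trimmed = arr[tuple(slice(0, (n // b) * b) for n, b in zip(arr.shape, binning))]
    shape = []
    for n, b in zip(trimmed.shape, binning):
        shape += [n // b, b]
    return trimmed.reshape(shape).sum(axis=tuple(range(1, 2 * arr.ndim, 2)))


def sketch_prep(data, intensity_threshold, crop_pad_values, shift, binning):
    data = np.where(data < intensity_threshold, 0, data)  # clear the noise
    data = np.sqrt(data)                                  # intensity to amplitude
    data = crop_pad(data, crop_pad_values)
    data = pad_to_235(data)
    data = center_max(data, shift)
    return bin_sum(data, binning)
```

For example, `next_235_size(13)` returns 15, so an axis of length 13 would be padded by two points in this sketch.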

Parameters:
  • beamline_full_datafile_name -- full path of tif file containing beamline preprocessed data

  • kwargs --

    data_dir : str

    directory where prepared data will be saved, default <experiment_dir>/phasing_data

    alien_alg : str

    Acronym of the method used to remove aliens. Possible options are: ‘block_aliens’, ‘alien_file’, and ‘AutoAlien1’. The ‘block_aliens’ algorithm zeroes out the defined blocks, the ‘alien_file’ method uses the given file as a mask, and ‘AutoAlien1’ removes aliens automatically. Each of these algorithms requires different parameters.

    aliens : list

    Needed when the ‘block_aliens’ method is configured. Used when the data contains regions with intensity produced by interference. These regions need to be zeroed out. The aliens are defined as regions, each given by the coordinates of its starting point and ending point (e.g. [[xb0,yb0,zb0,xe0,ye0,ze0],[xb1,yb1,zb1,xe1,ye1,ze1],…[xbn,ybn,zbn,xen,yen,zen]] ).

    alien_file : str

    Needed when the ‘alien_file’ method is configured. The user can produce a file in npy format that contains an array of zeros and ones, where zero means the pixel is set to zero and one means it is left unchanged.

    AA1_size_threshold : float

    Used in the ‘AutoAlien1’ method. Defaults to 0.01 if not given. The AutoAlien1 algorithm calculates the size of every cluster relative to the biggest cluster. Clusters with a relative size smaller than this threshold may be deemed aliens, depending also on their asymmetry.

    AA1_asym_threshold : float

    Used in the ‘AutoAlien1’ method. Defaults to 1.75 if not given. The AutoAlien1 algorithm calculates the average asymmetry of every cluster. Clusters with an average asymmetry greater than this threshold may be deemed aliens, depending also on their relative size.

    AA1_min_pts : int

    Used in the ‘AutoAlien1’ method. Defaults to 5 if not given. Defines the minimum number of nonzero points in a neighborhood for the area of data to count as a cluster.

    AA1_eps : float

    Used in the ‘AutoAlien1’ method. Defaults to 1.1 if not given. Used in the clustering algorithm.

    AA1_amp_threshold : float

    Mandatory in the ‘AutoAlien1’ method. Data points below this threshold are set to zero.

    AA1_save_arrs : boolean

    Used in the ‘AutoAlien1’ method, optional. If given and set to True, multiple results of the alien analysis will be saved in files.

    AA1_expandcleanedsigma : float

    Used in the ‘AutoAlien1’ method, optional. If given, the algorithm applies a final data-cleaning step using the configured sigma.

    intensity_threshold : float

    Mandatory, min data threshold. Intensity values below this are set to 0. The threshold is applied after removing aliens.

    crop_pad : list

    Optional, a list of numbers that adjust the size at each side of the 3D data. If a number is positive, the array is padded; if negative, it is cropped. The parameters correspond to [x left, x right, y left, y right, z left, z right]. The final dimensions are adjusted up to a size that suits the FFT and is compatible with the dimensions supported by OpenCL: powers of 2 multiplied by powers of 3 multiplied by powers of 5.

    shift : list

    Optional, a list defining the center shift. The array maximum is centered before binning and then moved according to the shift; [0,0,0] has no effect.

    binning : list

    Optional, a list that defines binning values in respective dimensions, [1,1,1] has no effect.

    no_center_max : boolean, defaults to False

    If True, the array maximum is not centered.

    next_fast_len : boolean, defaults to False

    Typically set to True. Changes the dimensions to sizes that allow a fast Fourier transform; the sizes depend on the library.

    pkg : string

    'cp' for cupy, 'torch' for torch, 'np' for numpy
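
As an illustration of how the parameters above fit together, the following sketch builds a mask suitable for the ‘alien_file’ method and a keyword dictionary for the ‘block_aliens’ method. All paths, array shapes, region coordinates, and threshold values are hypothetical, and `run_prep` is a hypothetical wrapper; only the `prep` signature and the parameter names come from this page. The calls that write files or run the preprocessing are left commented out.

```python
import numpy as np

# Hypothetical 3D data shape, used only to build an illustrative mask.
data_shape = (64, 64, 32)

# Mask for the 'alien_file' method: one keeps a pixel, zero sets it to zero.
mask = np.ones(data_shape, dtype=np.uint8)
mask[10:14, 20:25, 3:6] = 0  # hypothetical alien region
# np.save("alien_mask.npy", mask)  # then pass alien_file="alien_mask.npy"

# Keyword arguments for the 'block_aliens' method (values are placeholders).
prep_kwargs = {
    "data_dir": "my_experiment/phasing_data",  # hypothetical path
    "alien_alg": "block_aliens",
    "aliens": [[10, 20, 3, 14, 25, 6]],        # one region: start point, end point
    "intensity_threshold": 2.0,
    "crop_pad": [-4, -4, 0, 0, 8, 8],          # crop x, keep y, pad z
    "shift": [0, 0, 0],
    "binning": [1, 1, 1],
    "no_center_max": False,
    "next_fast_len": True,
    "pkg": "np",
}


def run_prep(beamline_full_datafile_name, **kwargs):
    # Imported here so the dictionary above can be inspected without cohere installed.
    import cohere_core.data.standard_preprocess as standard_preprocess

    standard_preprocess.prep(beamline_full_datafile_name, **kwargs)


# run_prep("my_experiment/preprocessed_data/prep_data.tif", **prep_kwargs)
```

Switching to the automatic method would mean setting `alien_alg` to ‘AutoAlien1’ and supplying at least `AA1_amp_threshold`, which is listed above as mandatory for that method.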