Running the clean workflow | Scribe

    Running the clean workflow

    • Jen Garbarino |
    • 14 steps |
    • 2 minutes
    In our lane-based system, a row or column can have markedly higher counts than the rest of the data, which can cause artifacts during clustering. Randomly downsampling these lanes brings them in line with the others and keeps the dataset balanced. This optional workflow detects outlier lanes and downsamples them.
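To make the idea concrete, here is a minimal sketch of outlier detection and random downsampling. It is not the workflow's actual code. The function names, the use of the population standard deviation, and the choice of the median as the downsampling target are all assumptions made for illustration. It also ignores the position file, which the real workflow takes into account.

```python
import random
import statistics


def find_outlier_lanes(lane_totals, deviations):
    """Return (outlier lanes, median) for a mapping of lane -> total count.

    A lane is flagged when its total exceeds
    median + deviations * standard deviation (assumed rule).
    """
    values = list(lane_totals.values())
    median = statistics.median(values)
    spread = statistics.pstdev(values)
    cutoff = median + deviations * spread
    outliers = {lane for lane, total in lane_totals.items() if total > cutoff}
    return outliers, median


def downsample_lane(fragments, target, seed=0):
    """Randomly keep at most `target` fragments, without replacement."""
    target = int(target)
    if len(fragments) <= target:
        return list(fragments)
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return rng.sample(fragments, target)


# Hypothetical per-lane totals: lane 3 is far above the others.
totals = {0: 100, 1: 110, 2: 95, 3: 400}
outliers, median = find_outlier_lanes(totals, deviations=1)

# Stand-in fragment list for the flagged lane.
lane_3_fragments = list(range(400))
kept = downsample_lane(lane_3_fragments, median)
print(sorted(outliers), len(kept))
```

In this toy input only lane 3 is flagged, and its fragments are reduced to the median lane total so that it no longer dominates the data.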
    1
    Click on the workflows button in LatchBio to go to the workflows tab
    2
    Click on the clean workflow
    3
    To add your sample, click add a row
    4
    Enter the Run_id here
    The Run_id must not contain spaces and cannot be a single number. For example, 1 or Sample 1 would cause errors, but Sample_1 will work.
    5
    Enter the single cell file. This file is generated during preprocessing and records how many reads and fragments are associated with each barcode.
    6
    Click select file and navigate to the singlecell.csv for your sample, located in the statistics folder of the preprocessing output.
    7
    Input the position_file. This is used to determine which barcodes are associated with tissue.
    8
    Click "Select File" and navigate to the tissue_position_list.csv for your sample generated by AtlasXBrowser. Click Select.
    9
    Associate the fragments_file with the sample.
    10
    Click "Select File" and navigate to the fragment file in the chromap_output folder for the sample. Click "Select"
    11
    Choose the name of the output directory for the cleaned fragment file. This will be the name of the folder inside the cleaned folder.
    12
    In the Deviations field, enter the number of standard deviations above the median at which a row or column is considered an outlier. Typically we use 1 or 2 for this value.
    You can use the lane_qc PDF in the statistics folder to help choose this number, as it shows the outliers for both 1 and 2 standard deviations. This initial calculation does not take the position file into account, so the lanes that are actually cleaned may differ.
    13
    If you would like to run the workflow for multiple samples, you can add additional rows.
    14
    When you have finished entering all the parameters, click launch workflow. You will be taken to the execution page, where you can track the progress of your workflow.
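Because a malformed Run_id causes errors, it can help to check the values before launching, especially when several rows are being added. The helper below is a hypothetical sketch and not part of LatchBio or the clean workflow. It assumes "a single number" means a Run_id made up only of digits.

```python
import re


def run_id_problems(run_id):
    """Return a list of reasons a Run_id is likely to be rejected."""
    problems = []
    if re.search(r"\s", run_id):
        problems.append("contains whitespace")
    if run_id.isdigit():  # assumed meaning of "a single number"
        problems.append("is only a number")
    return problems


# The examples from step 4: two bad values and one good one.
for candidate in ["1", "Sample 1", "Sample_1"]:
    issues = run_id_problems(candidate)
    status = "ok" if not issues else "; ".join(issues)
    print(f"{candidate!r}: {status}")
```

An empty list means the Run_id passes both checks described in step 4.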