    Running the optimize archR Workflow

    Jen Garbarino | 21 steps | 3 minutes
    Optimize archR is used to test clustering parameters and filtering before performing the full analysis. It generates plots showing the clustering spatially and in UMAP space, and it shows the TSS and number of fragments both on average and spatially. This workflow lets you quickly test different clustering parameters and filtering settings with minimal credits, because it skips many of the much more computationally expensive outputs of the create archRProject workflow, such as peak calling and motif enrichment.
    1
    Click on the workflows button in LatchBio to go to the workflows tab
    2
    Click on the optimize archr workflow
    3
    To add your sample, click "add a row"
    4
    Enter the Run_id here
    Run_id must not contain any spaces and cannot start with, or consist only of, a number. For example, "1" or "Sample 1" would cause errors, but "Sample_1" will work.
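The Run_id rule above can be checked before you type the value into the form. The following is a minimal sketch, not part of the workflow itself: the function name `is_valid_run_id` and the regular expression are assumptions that encode only the two constraints stated in this guide (no spaces, does not start with a number). The workflow may enforce additional rules that are not covered here.

```python
import re

# Hypothetical pattern encoding only the rules stated in this guide:
# the first character is neither a digit nor whitespace, and no
# whitespace appears anywhere else in the ID.
_RUN_ID_PATTERN = re.compile(r"[^\d\s]\S*")


def is_valid_run_id(run_id: str) -> bool:
    """Return True if run_id has no spaces and does not start with a number."""
    return _RUN_ID_PATTERN.fullmatch(run_id) is not None


if __name__ == "__main__":
    # The three examples from the guide.
    for candidate in ["1", "Sample 1", "Sample_1"]:
        status = "ok" if is_valid_run_id(candidate) else "would cause errors"
        print(f"{candidate!r}: {status}")
```

Of the three examples from the guide, only "Sample_1" passes this check.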
    5
    Next, associate the fragments file with this Run_id
    6
    To find the fragment file in LatchBio, click select file. Navigate to the fragment files and click "Select".
    Service run fragment files will be located in the Raw_Data folder.
    7
    Enter Condition if applicable to samples. You can leave this blank.
    No spaces are allowed within a condition. If you have two conditions, separate them with a dash (a dash cannot be used within a condition). Example: "post_treatment-non_responder" will give you two conditions: post_treatment and non_responder. "post treatment-non-responder" will lead to errors.
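To illustrate how a dash-separated condition string breaks down, here is a small sketch. The function `split_conditions` is hypothetical and only mirrors the rules described in this guide; it is not the workflow's own parsing code.

```python
from typing import List


def split_conditions(condition: str) -> List[str]:
    """Split a condition string on dashes, following the rules in this guide.

    An empty string means no condition was given (the field may be left blank).
    """
    if any(ch.isspace() for ch in condition):
        raise ValueError("condition must not contain spaces")
    if not condition:
        return []
    parts = condition.split("-")
    if any(part == "" for part in parts):
        raise ValueError("condition has an empty entry next to a dash")
    return parts


if __name__ == "__main__":
    print(split_conditions("post_treatment-non_responder"))
```

Because every dash is treated as a separator, a value such as "post_treatment-non-responder" would split into three parts rather than two, which is why underscores should be used inside a condition name.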
    8
    Next, we will input the location of the spatial directory. This folder is generated by AtlasXbrowser and is needed to connect the data to the image of the sample.
    9
    Click "Select Folder". Navigate to the location of the spatial folders. Select the folder called "spatial" within the folder labelled with the Run_id.
    Service spatial folders will be located in the Raw_Data folder
    10
    We will also associate the position file for the sample here. This file determines where on the image the data is located and which data points contain tissue and should be included in the analysis.
    11
    Click "tissue_positions_list.csv" to specify the tissue position list within the folder and click "Select".
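If you want to check a position file before running the workflow, the sketch below counts how many data points are flagged as containing tissue. It assumes the file follows the common 10x Visium-style layout, in which the second column holds 1 for on-tissue and 0 for off-tissue. That layout is an assumption, not something stated in this guide, so confirm it against your own file first. The function name `count_tissue_spots` is hypothetical.

```python
import csv
from typing import Iterable, Tuple


def count_tissue_spots(lines: Iterable[str]) -> Tuple[int, int]:
    """Count (on_tissue, off_tissue) rows in a tissue positions file.

    Assumes column 2 is an in-tissue flag with values "1" or "0".
    Rows whose second column is neither value (for example a header
    row or a blank line) are skipped.
    """
    on_tissue = 0
    off_tissue = 0
    for row in csv.reader(lines):
        if len(row) < 2:
            continue
        flag = row[1].strip()
        if flag == "1":
            on_tissue += 1
        elif flag == "0":
            off_tissue += 1
    return on_tissue, off_tissue
```

The function accepts any iterable of lines, so it can be called on an open file handle, for example `with open("tissue_positions_list.csv", newline="") as handle: print(count_tissue_spots(handle))`.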
    12
    Set the genome for the experiment (mm10 for mouse, hg38 for human). Contact AtlasXomics if you need an additional reference genome added. If using the tutorial or demo datasets, use the mouse genome.
    13
    Enter a project name for this workflow execution. The project name cannot contain spaces and will be used as the name of the output folder for this execution.
    Next we will test different clustering parameters to choose the best clustering for our experiment. There are three parameters to adjust which are explained in steps 14-16. For more details on how clustering works in archR go to: <https://www.archrproject.com/bookdown/iterative-latent-semantic-indexing-lsi.html>
    14
    LSI (Latent Semantic Indexing) resolution helps determine how the points in UMAP space are distributed. It is set on a scale of 0-1, with lower numbers being more conservative. Selecting the right resolution is needed to capture dataset complexity without oversimplifying, and it often requires experimentation to find the optimal balance. You can try multiple values by pressing the "+ LSI resolution" button.