# Make 1 library.csv for a pair of ATAC and RNA seq accession numbers
def read_rna_accession():
    with open('rna_accessions.txt') as f:
        samples = [sample for sample in f.read().split('\n') if len(sample) > 0]  # Remove empty lines
    return samples

def read_atac_accession():
    with open('atac_accessions.txt') as f:
        samples = [sample for sample in f.read().split('\n') if len(sample) > 0]  # Remove empty lines
    return samples

# Read ATAC and RNA accession IDs
atac_SRRs = read_atac_accession()
rna_SRRs = read_rna_accession()

# Define all rule for generating libraries.csv
rule all:
    input:
        expand("{atac_srr}_{rna_srr}_libraries.csv", atac_srr=atac_SRRs, rna_srr=rna_SRRs)

# Rule to create libraries.csv
rule create_libraries_csv:
    output:
        "{atac_srr}_{rna_srr}_libraries.csv"
    run:
        atac_srr = wildcards.atac_srr
        rna_srr = wildcards.rna_srr
        with open(output[0], "w") as f:
            f.write("fastqs,sample,library_type\n")
            f.write(f"atac_seq/{atac_srr},{atac_srr},Chromatin Accessibility\n")
            f.write(f"rna_seq/{rna_srr},{rna_srr},Gene Expression\n")

Hello, I am trying to make a libraries.csv file for each pair of atac_seq and rna_seq accession numbers {SRRXXXXXX}. For now, I have 2 SRRs for atac_seq and 2 SRRs for rna_seq. I want to generate one libraries.csv file per pair, so I should end up with 2 {atac_srr}_{rna_srr}_libraries.csv files.
With the code that I have, a file is generated for every possible combination of the atac_seq and rna_seq accession numbers (4 files for 2 x 2 accessions). I would like to generate them sequentially instead, meaning that the first SRR in the ATAC accession list is paired with the first SRR in the RNA accession list, the second with the second, and so on. Is there a way to do this?
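
To make the desired pairing concrete, here is a minimal sketch in plain Python of the position-by-position matching described above. The accession IDs and the helper name `paired_targets` are hypothetical and used only for illustration; the real lists would come from atac_accessions.txt and rna_accessions.txt.

```python
# Hypothetical accession IDs, standing in for the contents of the two text files.
atac_srrs = ["SRR_ATAC_1", "SRR_ATAC_2"]
rna_srrs = ["SRR_RNA_1", "SRR_RNA_2"]


def paired_targets(atac_ids, rna_ids):
    """Build one libraries.csv file name per (ATAC, RNA) pair, matched by position."""
    # Pairing by position only makes sense if both lists have the same length.
    if len(atac_ids) != len(rna_ids):
        raise ValueError("ATAC and RNA accession lists must have the same length")
    # zip walks both lists in step: first with first, second with second, ...
    return [f"{atac}_{rna}_libraries.csv" for atac, rna in zip(atac_ids, rna_ids)]


targets = paired_targets(atac_srrs, rna_srrs)
print(targets)
```

This yields 2 file names rather than the 4 produced by combining every ATAC ID with every RNA ID. In the Snakefile itself, the same pairing can likely be obtained by passing `zip` as the second argument to `expand`, i.e. `expand("{atac_srr}_{rna_srr}_libraries.csv", zip, atac_srr=atac_SRRs, rna_srr=rna_SRRs)`, since `expand` combines the wildcard lists as a full product by default. This assumes both accession files list the samples in the same order.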