The TRC shRNA Design Process

Overview

A brief narrative of the candidate selection process

Current Rule Set

Rule Set 9

Rule Description
1 aaStart9 Exclude any candidate beginning with AA (score = 0)
2 fourRow9 Exclude any candidate containing a run of four of the same base in a row (score = 0)
3 gcScore9 Exclude candidates with extreme GC percentage (GC <= 25% or > 60%); promote candidates with GC between 25-55% (score = 3); if GC > 55% and <= 60% then score = 1 (neutral)
4 nonGATC9 Exclude any candidate containing ambiguous bases (e.g. N) (score = 0)
5 restrictionSite9 Exclude any candidate containing certain restriction sites: ...GGTACC..., ...GAATTC..., ...CTCGAG..., ...CATATG..., ...ACTAGT..., ...GGTAC, ...GAATT, GTACC..., TACC..., CTAGT...
6 sevenGC9 Exclude any candidate with a run of 7 C/G bases (score = 0)
7 stemLoopStem Penalize candidates that can form an internal stem-loop (score = 0.1) (minimum stem length = 5, minimum loop size = 4)
8 threePrimeClamp6 Give precedence to candidates with weaker base-pairing at positions 15-20 (priority on pos. 17-19); score = 5 if all 6 positions are A or T, decreasing to 0.1 if all 6 are G/C. Score drops off steeply as the number of A/T bases decreases.
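
The simpler sequence-composition rules of Rule Set 9 (rules 1, 2, 3, 4 and 6) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the TRC implementation: the function names are hypothetical, a passing score of 1 is assumed to be neutral (as the gcScore9 description suggests), and the GC exclusion is assumed to score 0 like the other exclusions. The table does not say how per-rule scores are combined, so the sketch reports each score separately.

```python
import re

def aa_start9(seq):
    """Rule 1: exclude candidates beginning with AA."""
    return 0.0 if seq.startswith("AA") else 1.0

def four_row9(seq):
    """Rule 2: exclude candidates with four identical bases in a row."""
    return 0.0 if re.search(r"([ACGT])\1{3}", seq) else 1.0

def gc_score9(seq):
    """Rule 3: score by GC percentage of a non-empty candidate."""
    gc = 100.0 * sum(base in "GC" for base in seq) / len(seq)
    if gc <= 25 or gc > 60:
        return 0.0  # assumption: "exclude" means score 0, as in the other rules
    if gc <= 55:
        return 3.0  # promoted range
    return 1.0      # GC > 55% and <= 60%: neutral

def non_gatc9(seq):
    """Rule 4: exclude candidates containing anything other than A, C, G, T."""
    return 0.0 if re.search(r"[^ACGT]", seq) else 1.0

def seven_gc9(seq):
    """Rule 6: exclude candidates with a run of 7 C/G bases."""
    return 0.0 if re.search(r"[CG]{7}", seq) else 1.0

def score_rule_set9_subset(seq):
    """Return the individual scores for the rules sketched above."""
    seq = seq.upper()
    return {
        "aaStart9": aa_start9(seq),
        "fourRow9": four_row9(seq),
        "gcScore9": gc_score9(seq),
        "nonGATC9": non_gatc9(seq),
        "sevenGC9": seven_gc9(seq),
    }
```

A candidate would be dropped as soon as any of these scores is 0; the restriction-site, stem-loop and 3' clamp rules are left out of this sketch.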

Previous Rule Sets

Rule Set 8

Rule Description
1 aaStart AAstart: candidates beginning with AA get a penalty of 0.000000000000001
2 fourRow fourInARow: any run of four identical bases gets a penalty of 0.01
3 gcScore8 gcContent: extremes of GC percentage are penalized; candidates with GC <= 25% or > 60% get a penalty of 0.01; candidates with GC between 25-55% get a reward of 3; candidates with GC > 55% and <= 60% score 1 (neutral)
4 nonGATC no ambiguous bases allowed in the candidate 21mer sequence
5 restrictionSite8 GGTACC, GAATTC, CTCGAG, CATATG, ACTAGT, ...GGTAC, ...GAATT
6 sevenGC sevenGC; any run of 7 C or G gets the penalty of 0.01
7 stemLoopStem Penalize candidates that can form an internal stem-loop (score = 0.1) (minimum stem length = 5, minimum loop size = 4)
8 threePrimeClamp6 Give precedence to candidates with weaker base-pairing at positions 15-20 (priority on pos. 17-19); score = 5 if all 6 positions are A or T, decreasing to 0.1 if all 6 are G/C. Score drops off steeply as the number of A/T bases decreases.
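
The stemLoopStem rule can be illustrated with a simple search for a stretch of bases whose reverse complement occurs further downstream. The sketch below is one possible reading of the rule, not the TRC code: it assumes perfect Watson-Crick pairing only (no G-U wobble or mismatches), takes the minimum stem length of 5 and minimum loop size of 4 from the table, and assumes a neutral score of 1 when no stem-loop is found. The function names are hypothetical.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def can_form_stem_loop(seq, min_stem=5, min_loop=4):
    """True if some min_stem-long stretch can pair with a downstream stretch,
    leaving at least min_loop unpaired bases between the two."""
    seq = seq.upper()
    for i in range(len(seq) - min_stem + 1):
        stem = seq[i:i + min_stem]
        partner = reverse_complement(stem)
        # The partner must start after the stem plus the minimum loop.
        if seq.find(partner, i + min_stem + min_loop) != -1:
            return True
    return False

def stem_loop_stem(seq):
    """Score 0.1 for candidates that can form a stem-loop, else 1 (assumed neutral)."""
    return 0.1 if can_form_stem_loop(seq) else 1.0
```

Checking only stems of exactly the minimum length is sufficient here, because any longer perfect stem contains a stem of the minimum length with a loop at least as large.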

Rule Set 7

Rule Description
1 aaStart AAstart: candidates beginning with AA get a penalty of 0.000000000000001
2 fivePrimeClamp fivePrimeClamp: give precedence to candidates with stronger base-pairing at the 5 prime end of the putative candidate (referred to as five_prime_clamp); penalty/reward is 0.01 if the first two positions are GG; 0.0001 if the first two are TT; 2.5 if the first four are (G|C){4}; 2.4 if the first three are (G|C){3}; 2.2 if it begins (CC|CG|GC)(A|T)(G|C); 2 if it begins (CC|CG|GC); 2 if it begins (GC); 1.25 if it begins (G|C); 1 if it begins (A|T)(G|C); 0.5 if it begins (A|T){2}
3 fourRow fourInARow: any run of four identical bases gets a penalty of 0.01
4 gcScore gcContent: extremes of GC percentage are penalized; candidates with GC < 30% or > 70% get a penalty of 0.01; candidates with GC between 30-50% get a reward of 3; candidates with GC > 60% and < 70% get a reward/penalty of 1
5 internalAT internalAT: moderately AT-rich regions at positions 7 through 10 are rewarded; if all four are A|T, the reward is 2.2; if 3 of 4 are A|T, the reward is 2; if 2 of 4 are A|T, the reward is 1.5; if 1 of 4 is A|T, the penalty is 0.7; if none of the four are A|T, the penalty is 0.5
6 internalATFlanking internalATflank: A|T bases at positions 6 and 11 are rewarded; if both are A|T, the reward is 1.2; if one of the two is A|T, the reward is 1; if neither is A|T, the penalty is 0.85
7 internalLoop internalLoop: candidates that can form an AAABBB loop get a penalty of 0.7
8 nonGATC no ambiguous bases allowed in the candidate 21mer sequence
9 restrictionSite GCCGGC, CCCGGG, CTCGAG, ...GCCGG
10 sevenGC sevenGC; any run of 7 C or G gets the penalty of 0.01
11 threePrimeClamp6 Give precedence to candidates with weaker base-pairing at positions 15-20 (priority on pos. 17-19); score = 5 if all 6 positions are A or T, decreasing to 0.1 if all 6 are G/C. Score drops off steeply as the number of A/T bases decreases.
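
The internalAT and internalATFlanking rules of Rule Set 7 amount to counting A/T bases at fixed positions and looking up a score. A minimal sketch, assuming positions are 1-based from the 5' end of the candidate and that the candidate is at least 11 bases long; the function and table names are hypothetical:

```python
# Score by number of A/T bases at positions 7-10 (internalAT).
INTERNAL_AT_SCORES = {4: 2.2, 3: 2.0, 2: 1.5, 1: 0.7, 0: 0.5}

# Score by number of A/T bases at positions 6 and 11 (internalATFlanking).
FLANKING_SCORES = {2: 1.2, 1: 1.0, 0: 0.85}

def count_at(bases):
    """Count the A and T bases in a string."""
    return sum(base in "AT" for base in bases.upper())

def internal_at(seq):
    """Score positions 7-10 (1-based), i.e. Python slice [6:10]."""
    return INTERNAL_AT_SCORES[count_at(seq[6:10])]

def internal_at_flanking(seq):
    """Score positions 6 and 11 (1-based), i.e. Python indices 5 and 10."""
    return FLANKING_SCORES[count_at(seq[5] + seq[10])]
```

For example, a candidate with ATAT at positions 7-10 flanked by C and G gets 2.2 from internalAT and 0.85 from internalATFlanking.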

Rule Set 4

Rule Description
1 aaStart AAstart: candidates beginning with AA get a penalty of 0.000000000000001
2 fivePrimeClamp fivePrimeClamp: give precedence to candidates with stronger base-pairing at the 5 prime end of the putative candidate (referred to as five_prime_clamp); penalty/reward is 0.01 if the first two positions are GG; 0.0001 if the first two are TT; 2.5 if the first four are (G|C){4}; 2.4 if the first three are (G|C){3}; 2.2 if it begins (CC|CG|GC)(A|T)(G|C); 2 if it begins (CC|CG|GC); 2 if it begins (GC); 1.25 if it begins (G|C); 1 if it begins (A|T)(G|C); 0.5 if it begins (A|T){2}
3 fourRow fourInARow: any run of four identical bases gets a penalty of 0.01
4 gcScore gcContent: extremes of GC percentage are penalized; candidates with GC < 30% or > 70% get a penalty of 0.01; candidates with GC between 30-50% get a reward of 3; candidates with GC > 60% and < 70% get a reward/penalty of 1
5 internalAT internalAT: moderately AT-rich regions at positions 7 through 10 are rewarded; if all four are A|T, the reward is 2.2; if 3 of 4 are A|T, the reward is 2; if 2 of 4 are A|T, the reward is 1.5; if 1 of 4 is A|T, the penalty is 0.7; if none of the four are A|T, the penalty is 0.5
6 internalATFlanking internalATflank: A|T bases at positions 6 and 11 are rewarded; if both are A|T, the reward is 1.2; if one of the two is A|T, the reward is 1; if neither is A|T, the penalty is 0.85
7 internalLoop internalLoop: candidates that can form an AAABBB loop get a penalty of 0.7
8 nonGATC no ambiguous bases allowed in the candidate 21mer sequence
9 sevenGC sevenGC; any run of 7 C or G gets the penalty of 0.01
10 threePrimeClamp threePrimeClamp: give precedence to candidates with weaker base-pairing at the 3 prime end of the putative candidate; penalty/reward is 5 if the last three positions are A|T; 4.5 if the last two are A|T, the third from last is G|C, and the fourth from last is A|T; 4 if the last two are A|T; 2 if the last base is A|T; 0.2 if the last two positions are G|C; 0.5 if the last base is G|C; 0.8 if the last base is G|C and the previous two are A|T
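
The Rule Set 4 threePrimeClamp cases overlap (for example, "last base is G|C" also covers "last two are G|C"), so any implementation has to pick an order of precedence. The sketch below is one possible reading, not the TRC code: it assumes the most specific pattern wins, reads the 4.5 case as "third from last is G|C and fourth from last is A|T", applies the patterns to the 3' end of whatever string is passed in, and returns an assumed neutral score of 1 when nothing matches (for instance, an ambiguous final base). All names are hypothetical.

```python
import re

# (pattern anchored at the 3' end, score), ordered most specific first.
THREE_PRIME_PATTERNS = [
    (r"[AT]{3}$", 5.0),          # last three are A/T
    (r"[AT][GC][AT]{2}$", 4.5),  # A/T, G/C, then last two A/T
    (r"[AT]{2}$", 4.0),          # last two are A/T
    (r"[AT]$", 2.0),             # only the last base is A/T
    (r"[AT]{2}[GC]$", 0.8),      # last base G/C, previous two A/T
    (r"[GC]{2}$", 0.2),          # last two are G/C
    (r"[GC]$", 0.5),             # last base is G/C
]

def three_prime_clamp(seq):
    """Return the score of the first (most specific) matching pattern."""
    seq = seq.upper()
    for pattern, score in THREE_PRIME_PATTERNS:
        if re.search(pattern, seq):
            return score
    return 1.0  # assumption: neutral score when no pattern applies
```

Ordering the table from most to least specific keeps each case explicit and makes it easy to test an alternative precedence by reordering the list.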