感谢所有让我站起来的人。但是,我的cutSplice方法效率低下。我需要能够轻松地修改它,以便我可以执行相反的方法。
开发一个单独的Java类(LinkedDnaStrand),它使用链接的节点列表来表示支持拼接的DNA链。列表中的每个节点将包含一个或多个核苷酸(A,C,G或T)的字符串。该类将负责append()和cutSplice()等操作,这些操作模拟真实的限制酶处理。
package dnasplicing;
public class DnaSequenceNode {
public String dnaSequence;
public DnaSequenceNode previous;
public DnaSequenceNode next;
public DnaSequenceNode(String initialDnaSequence) {
dnaSequence = initialDnaSequence;
}
}
package dnasplicing;
public class LinkedDnaStrand implements DnaStrand {
int nodeCount = 0;
int appendCount = 0;
long nucleotideCount = 0;
String sequenceString;
DnaSequenceNode cursor, head, tail;
public LinkedDnaStrand(String dnaSequence) {
DnaSequenceNode newNode = new DnaSequenceNode(dnaSequence);
head = newNode;
cursor = head;
tail = head;
head.previous = null;
tail.previous = null;
sequenceString = dnaSequence;
nodeCount++;
}
public String toString() {
String result = "";
DnaSequenceNode n = head;
while (n != null) {
result += n.dnaSequence;
n = n.next;
}
return result;
}
@Override
public long getNucleotideCount() {
nucleotideCount = sequenceString.length();
return nucleotideCount;
}
@Override
public void append(String dnaSequence) {
if (dnaSequence != null && dnaSequence.length() > 0) {
tail.next = new DnaSequenceNode(dnaSequence);
tail.next.previous = tail;
tail = tail.next;
sequenceString += dnaSequence;
appendCount++;
nodeCount++;
}
}
@Override
public DnaStrand cutSplice(String enzyme, String splicee) {
boolean frontSplice = false;
boolean backSplice = false;
if (sequenceString.startsWith(enzyme)) {
frontSplice = true;
}
if (sequenceString.endsWith(enzyme)) {
backSplice = true;
}
String[] dnaParts = sequenceString.split(enzyme);
LinkedDnaStrand newLinkedStrand = null;
if (frontSplice == true) {
newLinkedStrand = new LinkedDnaStrand(splicee);
// newLinkedStrand.append(dnaParts[0]);
for (int i = 1; i < dnaParts.length; i++) {
newLinkedStrand.append(dnaParts[i]);
if (i < dnaParts.length - 1) {
newLinkedStrand.append(splicee);
}
}
} else {
newLinkedStrand = new LinkedDnaStrand(dnaParts[0]);
for (int index = 1; index < dnaParts.length; index++) {
newLinkedStrand.append(splicee);
newLinkedStrand.append(dnaParts[index]);
}
}
if (backSplice == true) {
newLinkedStrand.append(splicee);
}
// sequenceString = newLinkedStrand.toString();
return newLinkedStrand;
}
@Override
public DnaStrand createReversedDnaStrand() {
// TODO Auto-generated method stub
return null;
}
@Override
public int getAppendCount() {
// TODO Auto-generated method stub
return appendCount;
}
@Override
public DnaSequenceNode getFirstNode() {
return head;
}
@Override
public int getNodeCount() {
return nodeCount;
}
}
package dnasplicing;
public interface DnaStrand {
/**
* NOTE: Your LinkedDnaStrand class must have a constructor that takes one parameter: String dnaSequence. When the
* constructor completes, your linked list should have just one node, and it should contain the passed-in
* dnaSequence. For example, if the following line of code was executed:
*
* LinkedDnaStrand strand = new LinkedDnaStrand("GATTACA");
*
* Then strand's linked list should look something like (previous pointers not shown):
*
* first -> "GATTACA" -> null
*
* The first line of this constructor should look like:
*
* public LinkedDnaStrand(String dnaSequence) {
*/
/**
* @return The entire DNA sequence represented by this DnaStrand.
*/
public String toString();
/**
* Returns the number of nucleotides in this strand.
*
* @return the number of base-pairs in this strand
*/
public long getNucleotideCount();
/**
* Appends the given dnaSequence on to the end of this DnaStrand. appendCount is incremented. Note: If this
* DnaStrand is empty, append() should just do the same thing as the constructor. In this special case, appendCount
* is not incremented.
*
* @param dnaSequence
* is the DNA string to append
*/
public void append(String dnaSequence);
/**
* This method creates a <bold>new</bold> DnaStrand that is a clone of the current DnaStrand, but with every
* instance of enzyme replaced by splicee. For example, if the LinkedDnaStrand is instantiated with "TTGATCC", and
* cutSplice("GAT", "TTAAGG") is called, then the linked list should become something like (previous pointers not
* shown):
*
* first -> "TT" -> "TTAAGG" -> "CC" -> null
*
* <b>NOTE</b>: This method will only be called when the linke list has just one node, and it will only be called
* once for a DnaStrand. This means that you do not need to worry about searching for enzyme matches across node
* boundaries.
*
* @param enzyme
* is the DNA sequence to search for in this DnaStrand.
*
* @param splicee
* is the DNA sequence to append in place of the enzyme in the returned DnaStrand
*
* @return A <bold>new</bold> strand leaving the original strand unchanged.
*/
public DnaStrand cutSplice(String enzyme, String splicee);
/**
* Returns a <bold>new</bold> DnaStrand that is the reverse of this strand, e.g., if this DnaStrand contains "CGAT",
* then the returned DnaStrand should contain "TAGC".
*
* @return A <bold>new</bold> strand containing a reversed DNA sequence.
*/
public DnaStrand createReversedDnaStrand();
/**
*
* @return The number of times that the DnaStrand has been appended via a call to append() or during the cutSplice()
* operation. Note that the very first time that a DnaStrand is given a DNA sequence is not to be counted as
* an append.
*/
public int getAppendCount();
/**
* This is a utility method that allows the outside world direct access to the nodes in the linked list.
*
* @return The first DnaSequenceNode in the linked list of nodes.
*/
public DnaSequenceNode getFirstNode();
/**
* This is a utility method that allows the outside world to determine the number of nodes in the linked list.
*
* @return
*/
public int getNodeCount();
}
package sbccunittest;
import static java.lang.Math.*;
import static java.lang.System.*;
import static org.apache.commons.lang3.StringUtils.*;
import static org.junit.Assert.*;
import java.io.*;
import org.apache.commons.lang3.*;
import org.junit.*;
import dnasplicing.*;
// Updated 25-Feb-2016 at 6:10pm
public class LinkedDnaStrandTester {
static String ecoliSmall = "AGCTTTTCATTAGCCCGCAGGCAGCCCCACACCCGCCGCCTCCTGCACCGAGAGAGATGGAATAAAGCCCTTGAACCAGC";
static String ecor1 = "GAATTC"; // restriction enzyme
public static int totalScore = 0;
public static int extraCredit = 0;
public static InputStream defaultSystemIn;
public static PrintStream defaultSystemOut;
public static PrintStream defaultSystemErr;
public static String newLine = System.getProperty("line.separator");
public void testCutSplice() {
String enzyme = "GAT";
String splicee = "TTAAGG";
String[] strands = { "TTGATCC", "TCGATCTGATTTCCGATCC", "GATCTGATCTGAT" };
String[][] recombinants = { { "TT", "TTAAGG", "CC" },
{ "TC", "TTAAGG", "CT", "TTAAGG", "TTCC", "TTAAGG", "CC" },
{ "TTAAGG", "CT", "TTAAGG", "CT", "TTAAGG" } };
for (int ndx = 0; ndx < strands.length; ndx++) {
LinkedDnaStrand linkedStrand = new LinkedDnaStrand(strands[ndx]);
DnaStrand newlinkedStrand = linkedStrand.cutSplice(enzyme, splicee);
assertEquals("cutSplice(" + enzyme + ", " + splicee + ") failed at ndx = " + ndx, join(recombinants[ndx]),
newlinkedStrand.toString());
assertEquals("Append counts didn't match for ndx = " + ndx, recombinants[ndx].length - 1,
newlinkedStrand.getAppendCount());
// Verify that each node contains the correct DNA sequence
DnaSequenceNode node = newlinkedStrand.getFirstNode();
for (int nodeNdx = 0; nodeNdx < recombinants.length; nodeNdx++) {
assertNotNull("For strand " + ndx + ", there is no node at position " + nodeNdx, node);
assertEquals("For strand " + ndx + ", the sequences don't match at position " + nodeNdx,
recombinants[ndx][nodeNdx], node.dnaSequence);
node = node.next;
}
}
totalScore += 5;
}
/**
* Verifies that LinkedDnaStrand can model a cut and splice of (part of) the E Coli sequence using the ECoR1
* restriction enzyme and insulin as a splicee.
*/
@Test
public void testSpliceInsulinIntoEcoli() {
for (int testNumber = 1; testNumber <= 5; testNumber++) {
int startNdx = (int) (random() * 0.33 * ecoliSmall.length()); // Somewhere in the
// first third
int endNdx = ecoliSmall.length() - 1 - (int) (random() * 0.33 * ecoliSmall.length());
String ecoliPart = ecoliSmall.substring(startNdx, endNdx);
LinkedDnaStrand linkedStrand = new LinkedDnaStrand(ecoliPart);
SimpleDnaStrand simpleStrand = new SimpleDnaStrand(ecoliPart);
DnaStrand newL = linkedStrand.cutSplice(ecor1, insulin);
DnaStrand newS = simpleStrand.cutSplice(ecor1, insulin);
assertEquals(newS.toString(), newL.toString());
assertEquals(newS.getAppendCount(), newL.getAppendCount());
assertEquals(newS.getAppendCount(), newL.getNodeCount() - 1);
// Verify that the nodes exist
DnaSequenceNode node = newL.getFirstNode();
for (int ndx = 0; ndx < newL.getNodeCount(); ndx++) {
assertNotNull("There is no node at position " + ndx, node);
node = node.next;
}
}
totalScore += 10;
}
/**
* Verifies that LinkedDnaStrand can model a cut and splice efficiently.
*/
@Test
public void testSplicingTime() {
// First verify that the LinkedDnaStrand cutSplice works
int startNdx = (int) (random() * 0.33 * ecoliSmall.length()); // Somewhere in the first
// third
int endNdx = ecoliSmall.length() - 1 - (int) (random() * 0.33 * ecoliSmall.length());
String ecoliPart = ecoliSmall.substring(startNdx, endNdx);
LinkedDnaStrand linkedStrand = new LinkedDnaStrand(ecoliPart);
SimpleDnaStrand simpleStrand = new SimpleDnaStrand(ecoliPart);
String splicee = createRandomDnaSequence(1024 * 1024, 1024 * 1024);
DnaStrand newL = linkedStrand.cutSplice(ecor1, splicee);
DnaStrand newS = simpleStrand.cutSplice(ecor1, splicee);
assertEquals(newS.toString(), newL.toString());
assertEquals(newS.getAppendCount(), newL.getAppendCount());
// Now verify that it can cut and splice N times in less than T seconds
int numSplicings = 200;
double maxTime = 2.0;
double start = nanoTime();
for (int i = 0; i < numSplicings; i++)
newL = linkedStrand.cutSplice(ecor1, splicee);
double end = nanoTime();
double time = ((end - start) / 1e9);
// out.println("Time = " + time);
assertTrue("Time limit of " + maxTime + " seconds exceeded. Time to splice " + numSplicings + " times was "
+ time + " seconds.", time <= maxTime);
totalScore += 5;
}
/**
* Verifies that LinkedDnaStrand can create a new, reversed LinkedDnaStrand.
*/
@Test
public void testReverse() {
String dnaSequence = createRandomDnaSequence(50, 100);
String dnaToAppend = createRandomDnaSequence(5, 10);
int numTimesToAppend = (int) (random() * 10);
LinkedDnaStrand linkedStrand = new LinkedDnaStrand(dnaSequence);
SimpleDnaStrand simpleStrand = new SimpleDnaStrand(dnaSequence);
for (int ndx = 0; ndx < numTimesToAppend; ndx++) {
linkedStrand.append(dnaToAppend);
simpleStrand.append(dnaToAppend);
}
assertEquals(simpleStrand.toString(), linkedStrand.toString());
assertEquals(numTimesToAppend + 1, linkedStrand.getNodeCount());
LinkedDnaStrand rl = (LinkedDnaStrand) linkedStrand.createReversedDnaStrand();
// Verify that the original linked strand wasn't changed
DnaSequenceNode node = linkedStrand.getFirstNode();
int nodeNdx = 0;
while (node != null) {
assertEquals("Sequences don't match at node index " + nodeNdx, nodeNdx == 0 ? dnaSequence : dnaToAppend,
node.dnaSequence);
node = node.next;
nodeNdx++;
}
// Verify that the new strand string is reversed
assertEquals(simpleStrand.createReversedDnaStrand().toString(), rl.toString());
totalScore += 10;
// If the new strand has a reverse order of nodes and sequences within each node, give extra
// credit
int numNodes = linkedStrand.getNodeCount();
if (numNodes == rl.getNodeCount()) {
// Build array of reversed dna strings from original LinkedDnaStrand. Start at end of
// array and move toward
// start
node = linkedStrand.getFirstNode();
String[] reversedDnaSequences = new String[linkedStrand.getNodeCount()];
nodeNdx = numNodes - 1;
while (node != null) {
reversedDnaSequences[nodeNdx] = reverse(node.dnaSequence);
node = node.next;
nodeNdx--;
}
// Verify that the reversed list's nodes contain the same data as in the array
node = rl.getFirstNode();
nodeNdx = 0;
while (node != null) {
if (!node.dnaSequence.equals(reversedDnaSequences[nodeNdx]))
break;
node = node.next;
nodeNdx++;
}
if (nodeNdx == linkedStrand.getNodeCount())
extraCredit += 5;
}
}
private String[] createRandomDnaSequences(int numDnaSequences, int minLength, int maxLength) {
String[] dnaSequences = new String[numDnaSequences];
for (int ndx = 0; ndx < numDnaSequences; ndx++)
dnaSequences[ndx] = createRandomDnaSequence(minLength, maxLength);
return dnaSequences;
}
private String createRandomDnaSequence(int minLength, int maxLength) {
return RandomStringUtils.random((int) (random() * (maxLength - minLength) + minLength), "ACGT");
}
@BeforeClass
public static void beforeTesting() throws Exception {
totalScore = 0;
extraCredit = 0;
}
@AfterClass
public static void afterTesting() {
out.println("Estimated score (w/o late penalties, etc.) = " + totalScore);
out.println("Estimated extra credit (assuming on time submission) = " + extraCredit);
}
@Before
public void setUp() throws Exception {
defaultSystemIn = System.in;
defaultSystemOut = System.out;
defaultSystemErr = System.err;
}
@After
public void tearDown() throws Exception {
System.setIn(defaultSystemIn);
System.setOut(defaultSystemOut);
System.setErr(defaultSystemErr);
}
public void sendToStdinOfTestee(String message) {
System.setIn(new ByteArrayInputStream(message.getBytes()));
}
}
答案 0 :(得分:-1)
所以我没有像你正在使用的节点类这样的类,所以我不能写实际代码,但我会给你一些伪代码。
public String toString(){
String result = "";
Node n = start;
while(n != null){
string += n.data;
}
return result;
}
public long getNucleotideCount() {
long result = 0;
Node n = start;
while(n != null){
result += n.data.length();
}
return result;
}
public void append(String dnaSequence) {
end.next = new Node(dnaSequence);
end = end.next;
appendCount++;
}
public DnaStrand cutSplice(String enzyme, String splice) {
// For this I think it would be best to assemble into string then use replace
Node n = start;
String result = "";
while(n != null){
result += n.data;
}
result = result.replace(enzyme, splice)
return new DnaStrand(result);
}
反向方法与cutSplice类似。组装成串,操纵,然后返回新的链。如果您需要更多帮助LMK。