Unable to get string indexes in mapper function

Time: 2015-04-04 03:57:22

Tags: hadoop

I am stuck on getIndexes (int number , int size , int characters ).

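I have to place the converted number at the end of the array, because I need to pad it with zeros. Suppose it is 231 ...
That means I have to put 6 zeros at the start, followed by 20.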

//Input characters and length of motif
char [] inputChars = {'a','c','g','t'} ; 
int lengthOfMotif = 8 ;

public void map(Object key, Text value, Context context) throws IOException, InterruptedException
{

    /*
     * To generate all the combinations of length lengthOfMotif, I used the counting formula: the number of possible strings is inputChars.length raised to the power lengthOfMotif
     * 
     * Then I call my getIndexes method to get an int[] of length lengthOfMotif, where each entry is an index into inputChars
     * 
     * I build a motif from those indexes and use it as the key of the Mapper; minDistance returns minDistance,bestMatchingString,indexOfBestMatchingString, which becomes the value for that motif key
     * 
     * 
     */
    for (int i = 0 ; i < Math.pow(inputChars.length, lengthOfMotif) ; i ++ )
    {
        String motif = "" ; // initialize the empty motif string
        // each index returned by getIndexes() selects one character from inputChars to build the string
        for ( int j : getIndexes ( i , lengthOfMotif , inputChars.length ) )
        {

            motif = motif+inputChars[j] ; 
        }

        context.write(new Text(motif), new Text ( minDistance(motif,value.toString()  ) ) ) ;
    }


}
// Takes a number, the length of the resulting index array, and the number of unique characters

/*
 * I convert the number to the base given by the number of unique characters, so every digit is a valid index (always less than that base)
 * then I place the digits at the end of the indexes array, which leaves the leading entries as 0
 * 
 * As our length is 8 and we have 4 characters:
 *  if my number is 0
 *  converted to base 4 it remains 0
 *  placed at the end of the indexes array, the array will be
 *  0 0 0 0 0 0 0 0
 *  which, taken as indexes into inputChars, gives
 *  a a a a a a a a 
 *  The number of combinations is 4 ^ 8 = 65536; as we start from 0, our max number is 65535
 *  in base 4, 65535 is 3 3 3 3 3 3 3 3 which, taken as indexes, becomes
 *  t t t t t t t t 
 *  So every number from 0 to 65535 is covered and each combination is passed as a key of the mapper 
 *  
 */
int[] getIndexes (int number , int size , int characters )
{
    //init new result array
    int[] result = new int[size] ;
    // I am stuck here
    return result ;
}

//return concatenated string in format minDistance,bestMatching,index
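
The question does not show the body of minDistance, only its return format. As a rough illustration of that format, here is a minimal sketch that assumes the distance is the Hamming distance between the motif and each same-length window of the input line. The class name, the distance measure and the tie-breaking (first best window wins) are all assumptions, not the asker's code.

```java
class MinDistanceSketch {
    // Hypothetical helper: slides the motif over the sequence and returns
    // "minDistance,bestMatching,index" for the window with the fewest mismatches.
    // Assumption: distance = Hamming distance (number of differing positions).
    static String minDistance(String motif, String sequence) {
        int best = Integer.MAX_VALUE;
        String bestMatching = "";
        int bestIndex = -1;
        for (int start = 0; start + motif.length() <= sequence.length(); start++) {
            int distance = 0;
            for (int k = 0; k < motif.length(); k++) {
                if (motif.charAt(k) != sequence.charAt(start + k)) {
                    distance++;
                }
            }
            // strict '<' keeps the earliest window when several tie
            if (distance < best) {
                best = distance;
                bestMatching = sequence.substring(start, start + motif.length());
                bestIndex = start;
            }
        }
        return best + "," + bestMatching + "," + bestIndex;
    }

    public static void main(String[] args) {
        System.out.println(minDistance("acg", "ttacgtt")); // prints 0,acg,2
    }
}
```

If the sequence is shorter than the motif, no window exists and this sketch returns Integer.MAX_VALUE with an empty match and index -1, so a real job would probably want to skip such lines.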

2 Answers:

Answer 0: (score: 0)

        // 231 as a string
        // 2 -> index -> 0 
        // 3 -> index -> 1 
        // 1 -> index -> 2 
        // The array has indexes 0 to 7, 
        // so to add the padding 
        // 2 has to be at the 5th index 
        // 3 has to be at the 6th index 
        // 1 has to be at the 7th index 
        // The size variable holds the total required length, which is 8, 
        // so I subtract the length of "231" from 8 and add i: 8 - 3 + 0 = 5 
        // the second digit goes to the 6th index: 8 - 3 + 1 = 6 
        // and the third goes to the 7th index: 8 - 3 + 2 = 7 
        // Concatenating "" turns the character into a String, which Integer.parseInt converts to an int, 
        // so indexes 5 6 7 are filled with 2 3 1 

result [size-indexes.length()+ i] = Integer.parseInt(indexes.charAt(i)+"");
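
The line above only shows the assignment inside the loop. A complete getIndexes built around it could look like the sketch below. It assumes that indexes is the base-`characters` string of number, obtained here with Integer.toString(number, characters); the answer does not show how indexes is produced, so that part is a guess. The class name is only for illustration, and Character.digit is swapped in for the parseInt-on-concatenation trick.

```java
import java.util.Arrays;

class GetIndexesSketch {
    // Converts number to base `characters` and right-aligns its digits in an
    // array of length `size`; the leading entries stay 0 (the zero padding).
    // Assumes number >= 0 and that it needs at most `size` digits in that base.
    static int[] getIndexes(int number, int size, int characters) {
        int[] result = new int[size]; // Java initializes int arrays to 0
        // Assumption: this is how `indexes` is built, e.g. 45 in base 4 -> "231"
        String indexes = Integer.toString(number, characters);
        for (int i = 0; i < indexes.length(); i++) {
            // same placement as in the answer: size - indexes.length() + i
            result[size - indexes.length() + i] = Character.digit(indexes.charAt(i), characters);
        }
        return result;
    }

    public static void main(String[] args) {
        // 45 is 231 in base 4, so the digits land at indexes 5, 6 and 7
        System.out.println(Arrays.toString(getIndexes(45, 8, 4))); // prints [0, 0, 0, 0, 0, 2, 3, 1]
    }
}
```

With inputChars = {'a','c','g','t'}, that result would select the motif "aaaaagtc".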

Answer 1: (score: 0)

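    // Each value has the format minDistance,bestMatching,index; seed the best match with the first value
    String [] tokens = values.iterator().next().toString().split(",");
    int minDistance = Integer.parseInt ( tokens [ 0 ] ) ; 
    String bestMatching = tokens[ 1 ] ; 
    int index = Integer.parseInt( tokens [ 2 ] ) ; 

    // running total of distances, starting from the first value's distance
    int minimumDistance = minDistance ; 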


    for ( Text t : values ) 
    { 
        tokens = t.toString().split(","); 
        int distance = Integer.parseInt( tokens [ 0 ] ) ; 
        if ( distance < minDistance ) 
        { 
            minDistance = distance ; 
            bestMatching = tokens [ 1 ] ; 
            index  = Integer.parseInt( tokens [ 2 ] ) ; 
        } 
        minimumDistance = minimumDistance + distance ;