以致宏大，以致高远

June 7, 2013

Personal blog

I recently set up a blog of my own to make things easier to manage from now on; please drop by. ewreBio

posted @ 2013-06-07 21:41 ewre Views (185) | Comments (0)

June 6, 2013

A small awk scripting tip

1. Set built-in variables such as FS and OFS inside the BEGIN {} block; this is useful when writing tab-delimited files.

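posted @ 2013-06-06 10:30 ewre Views (206) | Comments (0)

May 31, 2013

Assorted shell "unexpected" errors

1. unexpected end of line
The script was probably edited on Windows, so its line endings do not match what the shell expects at run time. The fix is to replace the \r\n at the end of each line with \n.
2. unexpected "("
I hit this for the first time today. The script (a bash script) had no syntax errors. I later found the cause on Stack Overflow: on Ubuntu I had run the bash script with dash, the default interpreter, which is why it failed. Changing the shebang line to #!/bin/bash fixes it.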
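
The line-ending fix from item 1 can be sketched in Python. This is only an illustration under assumptions: the function names crlf_to_lf and fix_script are made up for this example, and the file is read as raw bytes so that no encoding is guessed.

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Replace every Windows line ending (CR LF) with a Unix one (LF)."""
    return data.replace(b"\r\n", b"\n")


def fix_script(path: str) -> None:
    """Rewrite the file at `path` in place with Unix line endings."""
    with open(path, "rb") as fh:
        data = fh.read()
    with open(path, "wb") as fh:
        fh.write(crlf_to_lf(data))
```

Calling fix_script on a script saved on Windows leaves its content unchanged apart from the line endings.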

posted @ 2013-05-31 09:32 ewre Views (455) | Comments (0)

May 30, 2013

shell find

find path_to_find -name expressions

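posted @ 2013-05-30 17:20 ewre Views (237) | Comments (0)

May 21, 2013

Two shell commands related to absolute paths

1. realpath
2. basename

realpath returns the absolute path; basename returns the name of the path's leaf node (its final component).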
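
For comparison, Python's standard library offers functions with the same names. The sketch below assumes a helper called absolute_and_leaf (a name invented here) and a made-up relative path; it does not require the path to exist.

```python
import os.path
from typing import Tuple


def absolute_and_leaf(path: str) -> Tuple[str, str]:
    """Return the resolved absolute path and its final component."""
    absolute = os.path.realpath(path)   # resolves "..", "." and symlinks
    leaf = os.path.basename(absolute)   # last component of the path
    return absolute, leaf


# Hypothetical relative path, used only for illustration.
abs_path, leaf = absolute_and_leaf("some/dir/../file.txt")
```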

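posted @ 2013-05-21 16:11 ewre Views (242) | Comments (0)

aptitude

While installing Apache this morning I probably broke the package dependencies; apt-get simply refused to install it. I switched to aptitude and everything went smoothly.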

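posted @ 2013-05-21 11:59 ewre Views (223) | Comments (0)

May 18, 2013

Downloading and using SRA data from NCBI

First, encrypted data (dbGaP controlled-access data) has to be applied for. Once access is granted, you set a project password, download the data, and then dump the downloaded files, which is roughly equivalent to decompressing them. Dumping uses NCBI's sratools. On a system with a graphical interface you can configure the decryption password directly with sratoolkit.jar; without one, you can use an older release (before 2.3) and do the configuration with the configuration Perl script. Once configured, dump the data and then align the sequences or do whatever else you need. All in all, the process is fairly tedious.

For newly downloaded files, remember to run the configuration Perl script to check for external reference sequence files; it downloads the required files automatically.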

posted @ 2013-05-18 14:46 ewre Views (4914) | Comments (0)

May 17, 2013

Downloading large data sets from NCBI with Aspera

ascp -h

Usage: ascp [OPTION] SRC... DEST

SRC to DEST, or multiple SRC to DEST dir

SRC, DEST format: [[user@]host:]PATH

-h,--help Display usage

-A,--version Display version.

-T Disable encryption

-d Create target directory

-p Preserve file timestamp

-q Disable progress display

-v Verbose mode

-6 IPv6

-D... Debug level

-l MAX-RATE Max transfer rate

-m MIN-RATE Min transfer rate

RATE: G/g(gig),M/m(meg),K/k(kilo)

-u USER-STRING User specific string

-i PRIVATE-FILE Private key file (.ppk)

-w DIRECTION Test bandwidth. DIRECTION: r,f

-K PROBE-RATE Probing rate for bandwidth measurement

-k RESUME-LEVEL Enable resume. RESUME-LEVEL: 0,1,2,3

-Z DGRAM-SIZE Manually set MTU

-g READ-SIZE Read block size.

-G WRITE-SIZE Write block size.

SIZE: K (kilo), M (meg), or just bytes

-L LOCAL-LOG-DIR Local logging dir

-R REMOTE-LOG-DIR Remote logging dir

-S REMOTE-ASCP Name of remote ascp command line

-e PRE-POST Pre and Post command file path

-O FASP-PORT UDP port used by FASP

-P SSH-PORT TCP port used by SSH

-C N-ID:N-COUNT Parallel transfer.

-E PATTERN Exclusion pattern. Repeat for more PATTERN.

-f CONFIG-FILE Specify alternate configuration file path

-W TOKEN-STRING Specify TOKEN-STRING for transfer

-@ RANGE-LOW:RANGE-HIGH Transfer only ranges within file

-X REXMSG-SIZE Size of retransmit request

--mode=MODE MODE: send, recv

--user=USERNAME

--host=HOSTNAME

--policy=TRANSFER_POLICY TRANSFER_POLICY: fixed,high,fair,low

--file-list=FILENAME

--file-pair-list=FILENAME

--source-prefix=PREFIX Prepend to each source path

--symbolic-links=METHOD METHOD: follow,copy,copy+force,skip

--remove-after-transfer Remove source files after they are transferred correctly

--remove-empty-directories Remove empty source subdirectories

--skip-special-files

--file-manifest=OUTPUT OUTPUT: text,none

--file-manifest-path=DIRECTORY

--file-manifest-inprogress-suffix=SUFFIX

--precalculate-job-size

--overwrite=METHOD METHOD: never,always,older,diff

--file-crypt=CRYPT CRYPT: encrypt,decrypt

--retry-timeout=SECS

--keepalive

--partial-file-suffix=SUFFIX

--src-base=NAME

--proxy=URL

--preserve-file-owner-uid

--preserve-file-owner-gid

HTTP fallback only options:

-y 0/1 1 = Allow HTTP fallback (default = 0)

-j 0/1 1 = Encode all HTTP transfers as JPEG files

-Y FILENAME HTTPS key file name

-I FILENAME HTTPS certificate file name

-t PORT HTTP fallback server port

-x PROXYSERVER-ADDR[:PORT] Proxy address and port (default 80)

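Note the -k option: setting it to 2 enables resuming an interrupted transfer.

posted @ 2013-05-17 10:42 ewre Views (785) | Comments (0)

May 15, 2013

Ubuntu 12.04 Unity keyboard shortcuts

Ubuntu has drawn all sorts of praise and criticism since its desktop switched to Unity. I stumbled on one small detail: hold down the Super key (usually the Win key on a PC) and a help-style list of keyboard shortcuts appears. Kind of neat.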

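posted @ 2013-05-15 14:05 ewre Views (276) | Comments (0)

May 8, 2013

Vim bits and pieces

1. Search and replace:

    Replace the matched text: :%s/pattern/replace/g

    Delete every line matched by a pattern: :g/pattern/d

2. Repeat the last command: .

3. Regular expressions:

    Line anchors: ^ $

    Repetition counts: \{m,n} , \{m}

    Metacharacters: * . (.* matches anything)

    Special sequences: \s whitespace, \+ one or more repetitions

4. Cursor motions: w e b ge (useful)

5. f{char} and F{char} search forward and backward within the line for char; ; and , repeat the last f or F motion (, repeats it in the opposite direction)

6. % jumps to the matching bracket

7. Sorting and removing duplicate lines: :sort , :g/^\(.*\)$\n\1$/d

8. The c (change) command accepts the motions e, w and $ just as d does. Deleting a line is so common that dd is used for it (dd rather than dl, because pressing d twice is faster).

9. p stands for put rather than paste (read it however you like): p can put back not only yanked text but also whatever the previous d operation deleted.

10. Moving around the text: G (usually typed as Shift+g) goes to the last line; gg goes to the top; {linenum}G goes to line linenum; /pattern searches for pattern and jumps to the first match, after which n (next) and N (previous) move between matches.

11. A summary of search and replace in Vim (reference: http://vimregex.com/)
First, an overview of the command:

:range s[ubstitute]/pattern/string/cgiI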
For each line in the range replace a match of the pattern with the string where:
c	Confirm each substitution
g	Replace all occurrences in the line (without g - only first).
i	Ignore case for the pattern.
I	Don't ignore case for the pattern.

Format of range:

Specifier	Description
number	an absolute line number
.	the current line
$	the last line in the file
%	the whole file. The same as 1,$
't	position of mark "t"
/pattern[/]	the next line where text "pattern" matches.
?pattern[?]	the previous line where text "pattern" matches
\/	the next line where the previously used search pattern matches
\?	the previous line where the previously used search pattern matches
\&	the next line where the previously used substitute pattern matches

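Next, patterns:

First the anchors. Word anchors: \<pattern_word\> matches exactly the word pattern_word
Line anchors: ^whole_line$ (you all know this one better than I do)

Then metacharacters and escape sequences: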

#	Matching	#	Matching
.	any character except new line
\s	whitespace character	\S	non-whitespace character
\d	digit	\D	non-digit
\x	hex digit	\X	non-hex digit
\o	octal digit	\O	non-octal digit
\h	head of word character (a,b,c...z,A,B,C...Z and _)	\H	non-head of word character
\p	printable character	\P	like \p, but excluding digits
\w	word character	\W	non-word character
\a	alphabetic character	\A	non-alphabetic character
\l	lowercase character	\L	non-lowercase character
\u	uppercase character	\U	non-uppercase character

Next come the quantifiers and the controls for greedy versus non-greedy matching:

Quantifier	Description
*	matches 0 or more of the preceding characters, ranges or metacharacters; .* matches everything, including an empty line
\+	matches 1 or more of the preceding characters...
\=	matches 0 or 1 of the preceding characters...
\{n,m}	matches from n to m of the preceding characters...
\{n}	matches exactly n times of the preceding characters...
\{,m}	matches at most m (from 0 to m) of the preceding characters...
\{n,}	matches at least n of the preceding characters...
where n and m are positive integers (>0)

Quantifier	Description
\{-}	matches 0 or more of the preceding atom, as few as possible
\{-n,m}	matches from n to m of the preceding characters, as few as possible
\{-n,}	matches at least n of the preceding characters, as few as possible
\{-,m}	matches from 0 to m of the preceding characters, as few as possible
where n and m are positive integers (>0)

Next, character ranges: [0-9a-zA-Z]; a leading ^ inside the brackets takes the complement.

vi also has a mechanism for capturing text (grouping and backreferences):
Form: s:\(word1\)\(word2\):action_use_\0\1..\9:

The table:

#	Meaning	#	Meaning
&	the whole matched pattern	\L	the following characters are made lowercase
\0	the whole matched pattern	\U	the following characters are made uppercase
\1	the matched pattern in the first pair of \(\)	\E	end of \U and \L
\2	the matched pattern in the second pair of \(\)	\e	end of \U and \L
...	...	\r	split line in two at this point
\9	the matched pattern in the ninth pair of \(\)	\l	next character made lowercase
~	the previous substitute string	\u	next character made uppercase

Then logical connectives: \| is called alternation and means OR.

Finally, combining a pattern match with a command:

:range g[lobal][!]/pattern/cmd
Execute the Ex command cmd (default ":p") on the lines within [range] where pattern matches. If pattern is preceded with a ! - only where match does not occur.

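posted @ 2013-05-08 11:39 ewre Views (359) | Comments (0)

December 19, 2012

小黄蜂 ("Little Hornet") engineering test codes

*#36#
The engineering test main menu includes: version info, backlight, screen, vibration motor, keypad, charging, microphone, earphone, headset, front camera, camera, touchscreen, gravity sensor, proximity sensor, light sensor, compass, audio, video *#0000#
Go straight to version info *#8375#
Go straight to version info *#8378#
Terminal flag *#87#
Automatically runs the following tests: audio, backlight, screen, front camera, camera, charging, microphone, earphone, headset, compass, and a long list of others. *#8929#
Reset the phone (restore factory settings) *#8924#
Bluetooth test *#477#
GPS test *#1111#
Wi-Fi settings *#8715#
Emergency call *#8555#
USB switching (probably the USB state used when testing or flashing the AP and BP) *#76#
Serial number *#4568#
Key verification *#8249#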

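posted @ 2012-12-19 12:51 ewre Views (263) | Comments (0)

Phone engineering: AP and BP

Reposted from CSDN
Depending on context, a phone's AP and BP can refer to either hardware or software.

1) Most phones contain two processors. The operating system, user interface, and applications all run on the Application Processor (AP), which generally uses an ARM-based CPU. The phone's radio-frequency communication control software runs on a separate CPU, called the Baseband Processor (BP).
The main reason for running the radio functions on the BP is that the radio control functions (signal modulation, encoding, radio frequency shifting, and so on) are highly timing-dependent. The best approach is to run them on a dedicated CPU that runs a real-time operating system.
Another benefit of using a BP is that once it has been designed and certified, it performs its (communication) functions correctly no matter how the operating system and application software change. In addition, bugs in the operating system or drivers cannot cause the device to send catastrophic data to the mobile network (an FCC requirement).
Because the AP and BP are separate devices, phone designers have more freedom in designing the user interface and application software.

2) Phone vendors such as Motorola split the phone software they develop into an AP part and a BP part. The software package that runs on the Application Processor (AP) is called the AP package and includes the operating system, user interface, applications, and so on; the package associated with the Baseband Processor (BP) is called the BP package and includes the baseband modem's communication control software and so on. Accordingly, "flashing the phone's AP and BP files" means updating these two packages on the phone. To make flashing easier, there are also all-in-one packages that bundle the AP and BP files with the flex file (the phone's parameter configuration file).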

posted @ 2012-12-19 12:21 ewre Views (209) | Comments (0)

October 22, 2012

cwm recovery compilation

Hi. I am creating this guide because I did not find any particular functional guide with details.

You must be running a 64 or 32 bit version of Ubuntu. Please note that I won't be going into the details of how to set up a build environment and sync sources, as there are many guides for that.

Step 1 :
Install the required packages

Step 2 :
Setup the build environment and sync the sources for the required CWM. CWM source comes bundled with the CyanogenMod source.

Code:

CWM 5 - Gingerbread
CWM 6 - Jellybean

Step 3 :
Now we come to the actual compiling part. Make sure you have synced the latest source using the "repo sync" command.
Change directory to your source.
Issue this command :

Code:

make -j4 otatools

Step 3.5 :
Do this step if your device is not officially supported by CM10.
Using terminal emulator on your device, issue the command

Code:

dump_image boot /sdcard/boot.img

This will dump the boot image to your sdcard. Transfer it to your home directory.

To build Android from source for a new device, you need to set up a board config and its makefiles. This is generally a long and tedious process. Luckily, if you are only building recovery, it is a lot easier. From the root of your Android source directory (assuming you've run envsetup.sh), run the following (substituting names appropriately):

Code:

build/tools/device/mkvendor.sh device_manufacturer_name device_name /your/path/to/the/boot.img

For example if you are having the Samsung Galaxy Ace device, the command will go as follows :

Code:

build/tools/device/mkvendor.sh Samsung cooper ~/boot.img

Please note that cooper is the device name. Only use "~/boot.img" if you have the boot image in your home directory; otherwise, specify the correct path.

You will receive the confirmation "Done!" if everything worked. The mkvendor.sh script will also have created the following directory in your Android source tree:

manufacturer_name/device_name

Step 3.5 ends here.

Step 4 :
Now that you have the device config ready, proceed.
Type the following code in your terminal in the source directory.

Code:

. build/envsetup.sh

This will setup the build environment for you to work.

Now launch the command

Code:

lunch full_device_name-eng

This will set the build system up to build for your new device. Open up the directory in a file explorer or IDE. You should have the following files: AndroidBoard.mk, AndroidProducts.mk, BoardConfig.mk, device_.mk, kernel, system.prop, recovery.fstab, and vendorsetup.sh.

The two files you are interested in are recovery.fstab and kernel. The kernel in that directory is the stock one that was extracted from the boot.img that was provided earlier. For the most part, recovery.fstab will work on most devices that have mtd, emmc, or otherwise named partitions. But if not, recovery.fstab will need to be tweaked to support mounts and their mount points. For example, if your /sdcard mount is /dev/block/mmcblk1p1, you would need the following line in your recovery.fstab:

/sdcard vfat /dev/block/mmcblk1p1

Once the recovery.fstab has been properly setup, you can proceed to the next step.

Step 5 :
Now we build the actual recovery.

Code:

make -j4 recoveryimage

This command builds the recovery image

You can use the command

Code:

make -j4 recoveryzip

to make a fakeflash recovery i.e. a temporary recovery to test out on the actual device.

Your recovery can then be found at "your_source_directory/out/target/product/device/recovery.img" and the temporary fakeflash zip in the utilities folder at the same location.

If everything works out well, you will have a working recovery.

Once you have working builds, notify "koush", on Github and he can build official releases and add ROM Manager support!

------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Some tips :

If you want to compile CWM 6, sync the jellybean branch using the command :
Code:
```
repo init -u git://github.com/CyanogenMod/android.git -b jellybean
repo sync
```
If you want to compile CWM 6 on a 32 bit system, you need to sync THIS source too. Instructions are given in the readme.
Run "make clobber" between builds if you change the BoardConfig.mk, or the change will not get picked up.
http://forum.xda-developers.com/showthread.php?t=1866545

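posted @ 2012-10-22 10:23 ewre Views (588) | Comments (0)

October 18, 2012

A caution about random operations inside loops

Random-number functions need a seed, and the usual choice is the machine's current time converted to an int. This approach has a problem:
when the random operation sits inside a loop and the generator is re-seeded on every pass, one pass may finish so quickly that the seed used in the previous pass is the same as the one used in this pass,
so both passes produce the same "random" values, which causes problems.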
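
A minimal Python sketch of the pitfall, under stated assumptions: frozen_clock is a hypothetical stand-in for something like int(time.time()) that has not ticked between passes, and the function names are invented for this illustration.

```python
import random


def reseed_each_pass(seed_source, n):
    """Anti-pattern: re-seed inside the loop from a coarse clock."""
    values = []
    for _ in range(n):
        random.seed(seed_source())      # same seed while the clock has not ticked
        values.append(random.random())  # so the same value comes out each time
    return values


def seed_once(seed, n):
    """Seed a generator once, outside the loop, then keep drawing from it."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


def frozen_clock():
    # Hypothetical: a clock reading that stays the same across fast iterations.
    return 1350000000


repeated = reseed_each_pass(frozen_clock, 5)
varied = seed_once(frozen_clock(), 5)
```

Here repeated holds five copies of one value, while varied holds five different draws; seeding once before the loop avoids the problem.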

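posted @ 2012-10-18 09:44 ewre Views (363) | Comments (0)

September 12, 2012

shapiro.test results depend on sample size

The result of shapiro.test is affected by sample size: for data that are not exactly normal, the p-value tends to decrease as the sample grows. This can easily mislead us into rejecting the null hypothesis even when the departure from normality is small.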

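posted @ 2012-09-12 10:22 ewre Views (673) | Comments (0)

September 7, 2012

Regular expressions and escaping in R

The classic example is:

strsplit(x,split="\\.")
paste(x,"csv",sep=".")
Both use . as the separator, yet strsplit needs it escaped twice while paste needs no escaping at all. The reason is that
the split argument of strsplit is a regular expression, whereas the sep argument of paste is a plain string.
As for why it is "\\." rather than "\.", keep in mind that:
\ is a metacharacter both in R strings and in regular expressions,
. is not a metacharacter in R strings but is one in regular expressions,
so to match a literal . we need to hand the regex engine "\."
But if we type that directly in R, R treats \ as its own escape character, so we escape once more to get past R's string parser: \\.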
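
The same two-layer escaping shows up in Python, which may make the point easier to see side by side. This is an analogous sketch, not R code; the file name is made up for the example.

```python
import re

name = "report.2012.csv"   # hypothetical file name

# In source code, "\\." is the two-character string \. ;
# the regex engine then reads \. as a literal dot.
parts = re.split("\\.", name)

# A raw string skips the first (string-literal) layer of escaping.
parts_raw = re.split(r"\.", name)

# Without any escaping, . matches every character, so nothing is left over.
unescaped = re.split(".", name)

# A join separator is a plain string, not a pattern, so no escaping is needed.
joined = ".".join(["report", "csv"])
```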

posted @ 2012-09-07 17:24 ewre Views (381) | Comments (0)

September 6, 2012

sed is almost unfairly powerful

sed -n '/pattern1/,/pattern2/p' filename
prints the lines from the one matching pattern1 through the one matching pattern2, inclusive
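
To make the range behavior concrete, here is a rough Python sketch of what the command does. It is an approximation under assumptions: the function name print_range is invented, and Python's re syntax differs from sed's basic regular expressions.

```python
import re


def print_range(lines, start_pat, end_pat):
    """Yield lines from a start match through the next end match, inclusive.

    As in sed, the end pattern is only tested on lines after the start line,
    and a range left open runs to the end of the input.
    """
    start = re.compile(start_pat)
    end = re.compile(end_pat)
    inside = False
    for line in lines:
        if inside:
            yield line
            if end.search(line):
                inside = False
        elif start.search(line):
            inside = True
            yield line


sample = ["a", "BEGIN", "x", "END", "b", "BEGIN", "y"]
selected = list(print_range(sample, "BEGIN", "END"))
```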

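posted @ 2012-09-06 16:46 ewre Views (288) | Comments (0)

September 4, 2012

Nautilus, Ubuntu's built-in file manager, can handle all kinds of remote logins

I used to think Nautilus was just a simple file manager. It turns out it can also open all kinds of remote connections, including SSH and FTP, and it can bookmark them,
which saves me from running an SSH client under Wine (3.4.2: File -> Connect to Server -> and off you go).
Also, there is a game called SuperTuxKart that plays quite well on Ubuntu; while I'm at it, here is an open-source game list

posted @ 2012-09-04 14:43 ewre Views (479) | Comments (0)

September 3, 2012

Accessing NCBI databases programmatically: an SNP case

Sometimes we need to query the NCBI databases in bulk, and for that we will most likely want the programmatic interface NCBI provides: Eutils.
Eutils consists of five major parts; see NCBI's online book for the detailed documentation. Here is an example that retrieves SNP information: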

use LWP::Simple;       # provides get()
use LWP::UserAgent;    # used for the epost request below
use HTTP::Request;
use List::MoreUtils qw/ uniq /;

sub xmlparse{
    #parameter: 1,xmlstring 2,out file descriptor
    $snpfetchrst = $_[0];
    $outpfile = $_[1];
    #get rsID, validation info, allele info, gene info, maf allele info
    my $rsid;
    my $taxid;
    my $content;
    my @geneid;
    my $alleinfo;
    my $filtr = 0;    # the bareword false would be the (true) string "false"
    #print table head
    #print $outpfile "refSNP","\t","alleles","\t","minoralle","\t","entrezGeneID","\n";
    while ($snpfetchrst =~ /<Rs rsId=\"(\d+)\".*?taxId=\"(\d+)\">(.*?)<\/Rs>/sg) {
        @geneid = ();
        $filtr = 0;
        $rsid = $1;
        $taxid = $2;
        $content = $3;
        if($taxid != 9606){
            next;
        }
           else{
            if (($content =~ /<Validation (.*?)\/>/sg)||($content =~ /<Validation .*?>(.*?)<\/Validation>/sg)) {
                $valid_state = $1;
                #print $valid_state,"\n";
                if($valid_state){    #validation state
                    $filtr = 1;
                    if ($content =~ /<Sequence.*?>(.*?)<\/Sequence>/sg) {
                    $seqinfo = $1;
                    $alleinfo = $1 if ($seqinfo =~ /<Observed>(.*?)<\/Observed>/);
                    }
                          if ($content =~ /<Assembly.*?reference="true">(.*?)<\/Assembly>/sg){
                        $assemContent = $1;
                        if($assemContent =~ /<Component.*?>(.*?)<\/Component>/sg){
                            $compcontent = $1;
                            if($compcontent =~ /<MapLoc.*?>(.*?)<\/MapLoc>/sg){
                                $funcinfo = $1;
                                #print $funcinfo,"\n";
                                while($funcinfo =~ /<FxnSet geneId=\"(\d+)\".*?\/>/sg){
                                    #print $1,"\n";
                                    push(@geneid,$1);
                                }
                                @geneid = uniq(@geneid);    #require moreUtils module
                            }
                        }
                    }
                    if ($content =~ /<Frequency.*?allele=\"(.*?)\".*?\/>/sg){
                        $malle = $1;
                    }
                }
                      else{
                    next;
                }
               }
            else{
                $filtr = 0;
                next;
            }
        }
        if($filtr) {print $outpfile ($rsid,"\t",$alleinfo,"\t",$malle,"\t",join(',',@geneid),"\n");}
    }
}    #end of xml_parse

my $utils = "http://www.ncbi.nlm.nih.gov/entrez/eutils";

open($refsnp,"<","snps.txt");

while (<$refsnp>) {
    chomp;
    $_ =~ /(\d+)/;
    $rsid = $1;
    push(@rss, $rsid);
}
close($refsnp);

my $id_list = join(',',@rss);
#print $id_list,"\n";

#assemble  the epost URL as an HTTP POST call
$url = $utils . "/epost.fcgi";

$url_params = "db=snp&id=$id_list";

#create HTTP user agent
$ua = new LWP::UserAgent;
$ua->agent("elink/1.0 " . $ua->agent);

#create HTTP request object
$req = new HTTP::Request POST => "$url";
$req->content_type('application/x-www-form-urlencoded');
$req->content("$url_params");

#post the HTTP request
$response = $ua->request($req);
$epostrst = $response->content;

my $querykey;
my $webEnv;

if($epostrst =~ /<QueryKey>(.*?)<\/QueryKey>/sg){
    $querykey = $1;
    #print $querykey,"\n";
}
if($epostrst =~ /<WebEnv>(.*?)<\/WebEnv>/sg){
    $webEnv = $1;
    #print $webEnv,"\n";
}
my $count = @rss;
my $retmax = 600;
my $efetch = $utils."/efetch.fcgi?";
open(my $outpf,">>","snp_queryinfo.txt");
if(defined($querykey)&&defined($webEnv)){
    for($retstart = 0;$retstart < $count;$retstart += $retmax){
        my $progress = $retstart/$count;
        my $prog = sprintf( "%.2f ",$progress);
        print "progression: $prog\n";
        $efetch_url = $efetch."db=snp&WebEnv=$webEnv&query_key=$querykey";
        $efetch_url .= "&retstart=$retstart&retmax=$retmax&retmode=xml";
        #print $efetch_url,"\n";
        $efetch_out = get($efetch_url);
        #print $efetch_out,"\n";
        xmlparse($efetch_out,$outpf);
    }
}
close($outpf);

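The script above reads the refSNP numbers stored in snps.txt, posts them with epost, retrieves the database records with efetch, and finally extracts the useful information from the XML results.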
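
The batching step of the script (paging through the posted IDs with retstart and retmax) can be sketched in Python as follows. This only builds the efetch URLs and sends nothing over the network; the function name efetch_urls and the placeholder WebEnv and query_key values are assumptions made for illustration, while the base URL and parameter names are taken from the Perl script.

```python
from urllib.parse import urlencode

# Base URL as used in the Perl script above.
EUTILS = "http://www.ncbi.nlm.nih.gov/entrez/eutils"


def efetch_urls(web_env, query_key, count, retmax=600):
    """Build one efetch URL per batch of at most `retmax` records."""
    urls = []
    for retstart in range(0, count, retmax):
        params = {
            "db": "snp",
            "WebEnv": web_env,
            "query_key": query_key,
            "retstart": retstart,
            "retmax": retmax,
            "retmode": "xml",
        }
        urls.append(EUTILS + "/efetch.fcgi?" + urlencode(params))
    return urls


# Placeholder values; real ones come from the epost response.
urls = efetch_urls("ENV", "1", count=1500)
```

With 1500 posted IDs and a batch size of 600, three URLs are produced, starting at records 0, 600 and 1200.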

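posted @ 2012-09-03 16:49 ewre Views (436) | Comments (0)

August 27, 2012

UCDAVIS's introduction to functional enrichment

I think one thing in the introduction is not quite right, namely the numbers in the contingency table. They should not be:

3-1 40
297 29960

but rather:

3-1 37
297 29663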
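
The arithmetic behind the corrected table can be sketched in Python. The totals 300 and 30000 are not stated in the post; they are inferred here from its numbers (3 + 297 and 40 + 29960), and the function name disjoint_table and its penalty parameter (for the "-1" in the top-left cell) are assumptions made for this illustration.

```python
def disjoint_table(list_hits, list_total, pop_hits, pop_total, penalty=0):
    """2x2 table whose second column excludes the genes already in the list.

    Rows: in the category / not in the category.
    Columns: gene list / rest of the background.
    """
    list_miss = list_total - list_hits
    rest_hits = pop_hits - list_hits                 # 40 - 3 = 37
    rest_miss = (pop_total - pop_hits) - list_miss   # 29960 - 297 = 29663
    return [[list_hits - penalty, rest_hits],
            [list_miss, rest_miss]]


table = disjoint_table(3, 300, 40, 30000)
table_with_penalty = disjoint_table(3, 300, 40, 30000, penalty=1)
```

Subtracting the list from the background makes the four cells disjoint, which is what a 2x2 test of independence assumes.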

posted @ 2012-08-27 16:12 ewre Views (310) | Comments (0)

August 23, 2012

R: function's local variable used as (the function's) parameter

codes:

test<-function(x=1,y=lv*3) {
lv<-1
x+y
}
test()
4
test(2)
5
test(1,1)
2

Now make a small change:

test<-function(x=1,y=lv*3) {
    cat(x+y,"\n")
    lv<-1
    x+y
}
test()
Error in cat(x + y, "\n") : object 'lv' not found

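So a function's local variable can be used in the default value of one of its own parameters, provided the local variable is assigned before that parameter is first used. This works because R evaluates default arguments lazily, inside the function's own environment.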

posted @ 2012-08-23 14:08 ewre Views (289) | Comments (0)

August 15, 2012

enlightening thoughts on maths

1, induction rule (this has been widely used)

2, squeeze rule (used in the proof of the limit of sin x / x)

3, monotone convergence rule for sequences (used to prove that a sequence converges, especially sequences with irrational limits such as e)
this rule says that for a sequence An:
if An+1-An>=0 and An<=some constant for every n, then An is convergent.
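
A small numerical check of rule 3 in Python, using the sequence (1 + 1/n)^n whose limit is e. This is an illustration over the first 1000 terms, not a proof; the bound 3 and the helper name a are choices made for this sketch.

```python
import math


def a(n):
    """n-th term of the sequence (1 + 1/n)^n."""
    return (1 + 1 / n) ** n


terms = [a(n) for n in range(1, 1001)]

# A(n+1) - A(n) >= 0 for every consecutive pair that was computed.
non_decreasing = all(y - x >= 0 for x, y in zip(terms, terms[1:]))

# Every computed term stays below the constant 3.
bounded = all(t <= 3 for t in terms)

gap_to_e = math.e - terms[-1]
```

Both conditions of the rule hold for the computed terms, and the last term is already close to e.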

posted @ 2012-08-15 10:32 ewre Views (345) | Comments (0)

August 13, 2012

Hardy-Weinberg equilibrium and genotype frequency

1, assume there are two alleles; from the genotype composition of the individuals we can get the allele proportions:
alleles: A a; composition: AA-x Aa-y aa-z
A proportion: pro(A)=(2*x+1*y)/(2*(x+y+z)) #each AA individual carries 2 copies of A, so one AA contributes 2 A.
a proportion: pro(a)=(2*z+1*y)/(2*(x+y+z))
note that x, y and z can be any values between 0 and 1 provided x+y+z=1; in other words, these three variables have 2 degrees of freedom.
2, Hardy-Weinberg equilibrium
Hardy-Weinberg equilibrium says that, without external disturbance and with random mating, the proportion of each allele, as well as the genotype composition of the individuals, remains the same generation after generation.
3, given the allele composition of a population, the genotype proportions under the equilibrium state are fixed.
   thus calculating the difference between the expected composition (the situation under the equilibrium state) and the real composition of the population
   can indicate whether this population evolved under the Hardy-Weinberg assumptions.
4, one more point:
   given the allele frequencies p of A and q of a in a population, one should not simply multiply p by q to get the proportion of genotype Aa. The heterozygote can arise in two ways (A from one parent and a from the other, or the reverse), so use the full multiplication table instead, which gives 2pq.
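
The calculations in points 1, 3 and 4 can be sketched in Python. The function names and the example genotype proportions (0.5, 0.2, 0.3) are made up for illustration.

```python
def allele_freqs(x, y, z):
    """Allele frequencies (p for A, q for a) from genotype counts or proportions.

    x, y, z are the amounts of AA, Aa and aa.
    """
    total = x + y + z
    p = (2 * x + y) / (2 * total)   # each AA carries two A, each Aa one
    q = (2 * z + y) / (2 * total)
    return p, q


def hw_expected(p, q):
    """Expected genotype proportions (AA, Aa, aa) under Hardy-Weinberg."""
    return p * p, 2 * p * q, q * q  # Aa arises two ways, hence 2pq


p, q = allele_freqs(0.5, 0.2, 0.3)
exp_AA, exp_Aa, exp_aa = hw_expected(p, q)
```

Comparing (exp_AA, exp_Aa, exp_aa) with the observed (0.5, 0.2, 0.3) is the comparison described in point 3.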

posted @ 2012-08-13 15:52 ewre Views (281) | Comments (0)

July 22, 2012

R coding hints

1, If you do not know how R internally converts between data types, declare your variables as explicitly as you can (variable type: character, numeric or boolean). An example:
   say you need a matrix to store something (after a series of computations in a "for" loop). I usually do this in one of two ways:
   I: create a vector (of whatever type: character, numeric, etc.) named mat_A, obtain a vector vec_i in every loop iteration, and bind mat_A with vec_i to grow the matrix
   II: calculate the ncol and nrow of the matrix in advance, create the matrix with matrix(ncol=m,nrow=n)->mat_A, and fill in the relevant part of the matrix in every loop iteration.
   The second method is strongly recommended, because with the first you may unintentionally (and automatically) convert all your numeric data to character data, and this kind of error is hard to debug if you are not aware of the coercion.

2, some functions have specific behavior; be aware of it and pay attention to it.
   take the function aes (in the ggplot2 package) as an example.
   aes maps values to aesthetics. func info: aes(x, y, ...more variables)
   what I want to emphasize is its behavior: aes looks in two places for the variables it maps, one is the global environment, the other is the
   data frame (colnames(df)) you provide to it. So there will be problems when you wrap this tiny function in your own function without proper handling (not passing it the data frame you want to work on, or improperly relying on a variable placed in the global environment beforehand). Example code follows:

colscatterplot<-function(commonExp){
    require(ggplot2)
    commonExp<-as.data.frame(commonExp)
    #every possible combinations
    cmbn<-combn(ncol(commonExp),2)
    #cat(ncol(cmbn)," ",nrow(cmbn),"\n")
    for(i in 1: ncol(cmbn))
    {
        colnma<-colnames(commonExp)[cmbn[,i][1]]
        cat(colnma,"\n")
        colnmb<-colnames(commonExp)[cmbn[,i][2]]
        cat(colnmb,"\n")
        m = lm(as.formula(paste(colnmb ,colnma,sep="~")), commonExp)
        eq <- substitute(italic(y)==b%*%italic(x)+a*~~italic(r)^2==r2,list(a = format(coef(m)[1], digits = 2), b = format(coef(m)[2], digits = 2),r2 = format(summary(m)$r.squared, digits = 3)))
        eqlbel<-as.character(as.expression(eq))
        fnm<-paste(colnma,"-",colnmb,".png",sep="")
        cat(fnm,"\n")
        png(filename=fnm)
        p.obj<-ggplot(data=commonExp, aes_string(x=colnma,y=colnmb)) + geom_point() + geom_smooth(method=lm) + geom_text(aes(label=eqlbel), parse = TRUE)
        print(p.obj)
        #ggsave(filename=fnm,plot=p.obj)
        dev.off()
    }
}

the function above will not act properly, because aes in geom_text looks for a global variable named eqlbel and assigns to its label param not the local eqlbel in this function but the global one. If there is no global variable named eqlbel, the function exits with an error, but if there happens to be one, the result is a baffling bug. This kind of error can stall your debugging for a long time without any progress.

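posted @ 2012-07-22 20:39 ewre Views (280) | Comments (0)

July 13, 2012

Some pitfalls in C

1. C has no string type as such; a string is implemented as a character array plus a terminating \0. However, strlen does not count that final \0 when computing the length, which sets a trap when we size buffers for
malloc (allocate strlen(s) + 1 bytes), so take special care.
2. free may not release memory quite the way we imagine; it is best to add ptr=NULL right after free(ptr).
3. Memory leaks must be guarded against at all times.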

posted @ 2012-07-13 14:25 ewre Views (238) | Comments (0)

June 25, 2012

CodeSourcery cross-compilation toolchain installation notes

Cross-compilation toolchain: CodeSourcery Lite arm-2009q1-203-arm-none-linux-gnueabi
Environment: Ubuntu 11.04 x64

First, it is best to install as the root user; otherwise you run into all sorts of problems.
Press Ctrl+T, run su and enter the password, then launch the downloaded .bin file. The installer is written in Java; click Next all the way through and the installation completes.
Second, press Ctrl+T to open a terminal and add the toolchain's install location to PATH with export; mine is /root/Codesourcery/Sourcery_G++_Lite/bin
export CROSS_COMPILE=arm-none-linux-gnueabi-
export ARCH=arm
Third, make your kernel.

Also, remember to install uboot-mkimage if you want to boot with U-Boot.

Here are some issues related to compiling the Android source:
1, about repo: first install curl using apt-get, then use curl to get repo. Note that
http://android.git.kernel.org
can no longer be used to get repo; try this one instead:

https://dl-ssl.google.com/dl/googlesource/git-repo/repo

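posted @ 2012-06-25 17:26 ewre Views (441) | Comments (0)

May 4, 2012

Once more on "a world in a grain of sand"

A few years ago I wrote down some thoughts on abstraction and granularity. They still seem meaningful to me, so I am recording them here.

The real world is infinite, in both time and space. I do not know general relativity well, so I will not venture to say whether the former is equivalent to the latter, and I simply list both.

When we do things, and especially when we develop a theory, we need to abstract reality into objects we are familiar with. The reason is plain: we use objects, or the behavior and attributes among objects, to model phenomena within some range of the real world; that is, we use theory to predict reality.

If so, a theory succeeds or fails by how well the abstraction agrees with reality under specific conditions. Those specific conditions are the cases our research cares about, or the scope within which our theory applies; universally valid laws have no research value.

As the saying goes, whoever tied the bell must untie it: the only way to test this agreement is experiment. But the result is determined by the process, and a key step in the process is choosing the granularity of the abstraction, which is also most of the work of abstracting.

My current work is the computer-aided, systems-level study of the mechanisms of biological processes at the molecular level, mostly the mechanisms of disease processes.

In the course of this work, my impression is that even a very brief real process can yield an abstract object of unbounded length, and a very small real entity can yield an abstract object of unbounded size; the infinite divisibility of matter has not yet run into a contradiction, either in logic or in reality.

What governs the size of the abstraction is its granularity. The sensible approach is this: a granularity fine enough to reflect how well we understand reality is sufficient, which is the perfectionist's requirement; going a step further, a granularity fine enough to make our model of the problem work is sufficient, which is the pragmatist's requirement.

What we usually call a mechanism is the use of a finer-grained abstract model, combined with a system of logic, to derive a coarser-grained, higher-level model in a reasonable way. The derivation of a mechanism takes place entirely in cognition and does not belong to the real world, which is why scientific research includes verification, that is, experiment.

In this sense we are like gamblers and reality is the big wheel: again and again we try to guess where the die will land, and in the course of guessing we keep searching for, and revising, whatever we believe to be the invariants we can rely on.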

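posted @ 2012-05-04 10:55 ewre Views (294) | Comments (0)

April 25, 2012

Limit on the number of elements in a static array

I tried this today: a statically declared array has an upper limit on its number of elements, whereas dynamically allocated or global arrays
do not seem to hit it. Reportedly this depends on compiler settings and on the stack versus the heap: stack space is limited, while the heap is comparatively large.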

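posted @ 2012-04-25 12:45 ewre Views (311) | Comments (0)

April 16, 2012

A few small tricks

1. Using the modulo operation for cyclic grouping
By cyclic grouping I mean this situation:
suppose you read a text file line by line and want to process every 4 lines together as one unit. How do you do it?
Create 4 variables, one to hold each line.
Take the line number modulo 4 and switch on the remainder to store the line in the corresponding variable; when the remainder is zero, do the combined processing.

The takeaway: taking consecutive natural numbers modulo i sorts them into i groups according to the remainder.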
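
The trick above can be sketched in Python. The function name process_in_groups is invented for this example, "processing" is reduced to collecting each group as a tuple, and a trailing incomplete group is simply dropped.

```python
def process_in_groups(lines, group_size=4):
    """Collect every `group_size` consecutive lines into one tuple."""
    slots = [None] * group_size          # one slot per line of a group
    results = []
    for line_no, line in enumerate(lines, start=1):
        remainder = line_no % group_size
        # Remainders 1, 2, 3, 0 map to slots 0, 1, 2, 3.
        slots[(remainder - 1) % group_size] = line
        if remainder == 0:               # group complete: process it
            results.append(tuple(slots))
    return results                       # an unfinished last group is ignored


groups = process_in_groups(list("abcdefghij"))
```

Ten input lines give two complete groups; the last two lines never reach remainder zero and are left unprocessed.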

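posted @ 2012-04-16 13:01 ewre Views (218) | Comments (0)

April 1, 2012

Mistaken ideas in handling probability

Venn diagrams and probability
I now realize that my biggest past mistake in handling probability was
subconsciously assuming that the events involved were independent.
Also:
independence has no direct connection with overlapping areas in a Venn diagram;
the areas in a Venn diagram are a visual representation of the elements each event contains.
Being independent does not mean having no elements in common.

Discrete and continuous
Discrete versus continuous, or more precisely finite versus infinite, has long been something I could not get straight.
One awkward point about discrete and continuous variables is that
a continuous variable has probability 0 at any specific point. I now think this puzzlement comes from
not having understood clearly how discrete and continuous variables are defined or distinguished. A continuous
variable is actually simple: it is a variable with uncountably many states; beyond that we need not care.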

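posted @ 2012-04-01 10:37 ewre Views (217) | Comments (0)

March 12, 2012

How the mean and the median relate on a histogram

First, note that:
the mean is the point at which the weighted areas of the histogram on its two sides are equal, where each block of area is weighted by
the distance from the block's center to the mean;
the median splits the histogram into left and right parts of equal area.

So for right-tailed, symmetric, and left-tailed distributions:
Right-tailed (right skewed): the median lies left of the center of the data range, and the mean lies to the right of the median
Symmetric: omitted
Left-tailed (left skewed): the median lies right of the center of the data range, and the mean lies to the left of the median

This relationship between the mean and the median comes from the "weighted" nature of the mean's position.

The mean is easily affected by extreme values (a few very large or very small values); the median is not.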
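
A quick Python check of these statements with the standard statistics module. The three small data sets are made up for illustration.

```python
import statistics

# Made-up samples: one long right tail, its mirror image, and a more extreme tail.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 20]
left_skewed = [-v for v in right_skewed]
more_extreme = [1, 2, 2, 3, 3, 3, 4, 4, 200]

mean_r, median_r = statistics.mean(right_skewed), statistics.median(right_skewed)
mean_l, median_l = statistics.mean(left_skewed), statistics.median(left_skewed)
mean_x, median_x = statistics.mean(more_extreme), statistics.median(more_extreme)
```

In the right-skewed sample the mean sits to the right of the median, in the mirrored sample to the left, and pushing the largest value from 20 to 200 moves the mean a lot while the median stays put.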

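posted @ 2012-03-12 12:56 ewre Views (1386) | Comments (0)

March 8, 2012

A caution about C++ header inclusion

Plan your C++ header includes carefully from the very start, and avoid headers that include each other.
If mutual inclusion does slip in, then with conditional compilation (include guards) in place you get all sorts of strange symptoms;
"xxxx undefined" is one of them.
C++ Primer says that forward declarations can be used to handle such mutual-inclusion cases.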

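posted @ 2012-03-08 15:34 ewre Views (264) | Comments (0)

March 5, 2012

R code optimization

I have recently been optimizing a functional network analysis method implemented in R, and learned the following:

1: In R, indexing a matrix with boolean (logical) values can greatly improve efficiency: most for and while loops that operate on a matrix
can be rewritten as logical-indexing operations on the matrix. For example:
matrix A has 10000 rows and 2 columns. Swapping the two entries of the relevant rows with a for loop: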

A<-matrix(c(sample(1:10000,10000),sample(1:10000,10000)),ncol=2,byrow=T)
B<-A
proc.time()->t1
for(i in 1:nrow(A)){
if(A[i,1]>A[i,2]) A[i,]<-A[i,c(2,1)]}
proc.time()->t2
t2-t1

user system elapsed
0.13 0.02 0.14

Written instead as:

proc.time()->t1

B[B[,1]>B[,2],]<-B[B[,1]>B[,2],c(2,1)]

proc.time()->t2

t2-t1

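user system elapsed
0.06 0.00 0.06

With 10000 rows both approaches are fast and differ by only 0.08 seconds, but with 100000 rows the difference reaches 0.5 seconds

2: Prefer the apply family of functions for operations on matrices.

3: Do not multiply entities beyond necessity. Occam's razor is worth keeping in mind everywhere.

There is also a method for removing duplicate binary interaction pairs, noted here:
Suppose the binary interaction pairs are stored in matrix A and are unordered (both a<->b and b<->a are present).
First, sort the whole matrix by the first column of A:
A[order(A[,1]),]->A
Then remove duplicates:
A<-unique(A)
Then rearrange every interaction pair using one consistent rule (either greater-than or less-than works):

A[A[,1]>A[,2],]<-A[A[,1]>A[,2],c(2,1)] # using A[,1]<A[,2] as the rule works equally well

Run unique once more:
A<-unique(A)
and the job is done.
4: Mind the logical operators in R:
R has two kinds of logical operators: the short forms & | ! (which work element-wise on vectors) and the long forms && ||; see help for details
What we usually need are the short forms.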
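
The de-duplication of unordered interaction pairs described above can also be sketched in Python. This is an equivalent idea rather than a translation of the R code: each pair is put into a canonical order and a set remembers what has been seen. The function name unique_unordered_pairs and the sample pairs are made up.

```python
def unique_unordered_pairs(pairs):
    """Drop duplicates, treating (a, b) and (b, a) as the same pair."""
    seen = set()
    result = []
    for a, b in pairs:
        key = (a, b) if a <= b else (b, a)   # one consistent ordering rule
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result


pairs = [(1, 2), (2, 1), (3, 1), (1, 3), (1, 2), (4, 4)]
deduplicated = unique_unordered_pairs(pairs)
```

Each pair comes out with its smaller member first, in order of first appearance.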

posted @ 2012-03-05 15:20 ewre Views (475) | Comments (0)
