I can believe fly.

Monday, May 6, 2013

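Hudson User Guide: Getting Started and Beyond

I. Getting to Know Hudson

What is Hudson?

Hudson is a continuous integration tool. Its main features are:
1. Easy to install and deploy
2. A friendly interface for configuring build scripts
3. Distributed builds
4. Remote monitoring of externally scheduled jobs
5. Convenient plugin management
6. User management
7. Build queue control
8. RSS/E-mail/IM integration: build results can be published via RSS, and build failures can be reported in real time by e-mail.
9. JUnit/TestNG test report generation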

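How do I deploy and install it?

The server first needs a JDK. Hudson can then be deployed in either of two ways:
1. From the command line: java -jar hudson.war
2. Through Tomcat: open Tomcat Manager, find the "WAR file to deploy" section, and deploy hudson.war
Once deployed, browse to a URL of the form http://192.168.1.1/hudson to reach the web interface.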
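
After either deployment method, it can be handy to confirm from a script that the web interface is actually answering. The following is a minimal sketch, not part of Hudson itself: the function names is_up and wait_until_up are made up for illustration, and the URL to pass in is whatever address your own server uses.

```python
import time
import urllib.error
import urllib.request


def is_up(url, timeout=5):
    """Return True if an HTTP request to url gets any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure and similar.
        return False


def wait_until_up(url, attempts=30, delay=2.0, probe=is_up):
    """Poll url until probe(url) succeeds or the attempts run out."""
    for _ in range(attempts):
        if probe(url):
            return True
        time.sleep(delay)
    return False
```

Calling wait_until_up("http://192.168.1.1/hudson") with the example address above returns True once the server responds and False if it never does within the allotted attempts.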

Tuesday, January 3, 2012

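Windows 7: Black Screen After Login

Symptoms:

The screen goes black after login. Ctrl+Alt+Del responds, but starting Task Manager does nothing. Safe mode works normally. What has gone wrong with explorer?

Analysis:

1. In safe mode, disabled the startup items; the problem persisted.

2. In safe mode, ran a virus scan (no sign of the tlntsvi.exe mentioned online); the problem persisted.

3. In safe mode, reinstalled the graphics driver; the problem persisted (the old driver was probably not removed cleanly).

4. In safe mode, uninstalled suspicious software; the problem persisted.

5. In safe mode, created a new account and tried it; the problem persisted.

6. Found an article at http://support.microsoft.com/kb/929135 about performing a clean boot to track down the problem. It looked promising.

1) First used Diagnostic startup, and login worked. Then switched to Selective startup to check the services group by group.

2) It turned out that whenever the last 50 services were enabled together, the problem came back. That group probably included the graphics driver (a guess I made afterwards), which sent me back to investigating the graphics driver.

7. Kept at it: uninstalled the graphics driver in safe mode and, this time, did not reinstall it. Login worked. As soon as the driver was installed again, it broke.

8. Called in the hardware administrator, who swapped the graphics card; the problem persisted.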

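After a whole day of analysis without finding the cause, I was close to despair. I normally refuse to reinstall the system before finding the cause, but my resolve was starting to waver. Then, unwilling to give up, I went back to the machine, opened "Control Panel" -> "Programs and Features", and decided to uninstall whatever looked suspicious:

YY had recently been upgraded to a beta build, which might be unstable, so it went first;

then dxsdk_apr2006, which had been installed for two or three weeks and was no longer needed, so it went too;

and QQ电脑管家 (QQ PC Manager), which I think I had already uninstalled earlier.

And, as if by magic, the computer started working again.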

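Conclusion:

Without the graphics driver but with SDK (2006): works. With the graphics driver but without SDK (2006): works. With both the graphics driver and SDK (2006): broken.

Fix:

Uninstall dxsdk_apr2006, and login works normally.

Postscript:

In hindsight, I should have uninstalled things one at a time. Removing several at once makes it easy to lose track of what conflicts with what.

To settle it, I reinstalled YY first and rebooted. Hmm, all normal.

Then I reinstalled SDK (2006). Oh no, I could not log in again.

So it was back to safe mode to uninstall it.

What puzzles me is that SDK (2006) had been installed for weeks. Why did it suddenly start causing trouble?

How is that to be explained?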

Tuesday, July 12, 2011

[Repost] Getting Control of Third Party Libraries

How are people reining in 3rd party libraries? Selecting which packages are suitable for use, and which licenses? How does your software process identify when someone uses unapproved software or licenses?

   1. Martin Hache (Senior Technical Java J2EE Consultant at HP)
      Terrific question, I'm surprised no one's responded in the 24 hours since it's been posted. I'll tell you what we did but I would not exactly call it a solution, also, I found the problem to be larger than what you described in your post.

      Before I relate my experience, I would suggest that you look at Maven2, I hear that it has a system to manage library dependencies; I've never used it but those who love it, really love it.

      The problem for us (a web development team of 15 or so individuals working on a few dozen web apps) was not only 3rd party libraries but also the 3rd party libraries that 3rd party libraries used (4th party?), for example when the Spring JAR uses some Apache Commons JARs. The versions of these libraries would clash with the versions our applications wanted to use. This was particularly apparent in our own reusable components, which could be shared across several applications. These amounted to 3rd party libraries with 3rd party libraries of their own.

      Unfortunately, we never did crack this nut to my satisfaction; we settled for establishing a few guidelines. We added version numbers to JARs if they didn't have them (so spring.jar became spring-2.0.2.jar), which allowed us to ID the version of a library with a simple look. On top of that we basically leveraged our build order: components distributed the JARs they needed to compile to the child apps/components that depended on them. Those dependent modules didn't usually contain the JARs if one of the components they depended on distributed them. If a child app/component needed a newer JAR than what the parent component was offering, then we would create a new version of the parent with the new version of the JAR. Clashes were handled manually, by talking through them.

      Your question may have more to do with licensing and authority to use a library, but we didn't do much of that, again a manual vetting of the Jar and licensing is what we did.

   2. Curt Yanko (Sr IS Architect at UHG)
      I have left it *open-ended* since I'm fishing a bit here and don't want to influence the answers too much.
      I am indeed talking about the full monty, as it were, and am interested in creating a Definitive Software Library of *approved* components and then constraining the build system to just those. Additionally, I want to fail a build should I see a license that scares me. I'm looking at you, Affero!
      Maven is indeed central to our strategy. Site reports and their BOM's play a key role in at least getting visibility. Now I want control.
     
   3. Ben Weatherall (Configuration Manager at PDX, Inc.)
      First, I want to point out a commercial solution from OpenLogic called OLEX. It does the license analysis (are these licenses compatible?) and license obligations (if you use this, in this fashion, you must do ...). It also has functionality to scan both source and binaries to determine which FOSS components are included, whether you intended for them to be or not. And no, I am not associated with OpenLogic - just paranoid enough to be checking them out.

      Now, as to what we actually (try to) do, whenever someone decides to add a third party component to the mix, they are required to submit links to the "owner" and to wherever they acquired the binaries. They are then allowed to commit to the primary repository only those binaries that are actually built from the component source. All of the other dependent components that may be supplied "for convenience" must be checked that they are available from an "owner" and then the process recurses.

      I enforce this, otherwise no license checks would ever be done and there would be no way I could "escrow" the source in case of legal challenge. Periodically, the architecture team, product managers and CM get together to review changes to our third party repository and how they are being used. We try to distribute the work since it is very time intensive to track everything back to the "owner" level and verify it.

      Curt, we do not use Maven for the very reason you like it - it will find what it needs even if I don't know about it. I am not totally happy with our solution since it is so manually intensive, but it is what I have today. It will evolve. Either that or I need to revoke Development's right to update third party components and allocate that function to an already overloaded team.

      We get our BOMs, etc. from a use of AccuRev, cvs and AnthillPro. The combination has kept me out of too much trouble so far.
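
The idea raised in the discussion above, failing a build when a component carries an unapproved license, could be sketched roughly as follows. This is only an illustration, not any participant's actual process: the allowlist, the bill-of-materials entries, and the function names are all hypothetical, and a real check would read its data from the build tool's dependency report.

```python
# Hypothetical allowlist; a real one would come from your legal review.
APPROVED_LICENSES = {"Apache-2.0", "MIT", "BSD-3-Clause"}


def unapproved(bom, approved=APPROVED_LICENSES):
    """Return the (name, version, license) entries whose license is not approved."""
    return [entry for entry in bom if entry[2] not in approved]


def check_bom(bom, approved=APPROVED_LICENSES):
    """Return 0 if every component is approved, 1 otherwise (usable as an exit code)."""
    bad = unapproved(bom, approved)
    for name, version, license_id in bad:
        print(f"unapproved license: {name}-{version} ({license_id})")
    return 1 if bad else 0


# Hypothetical bill of materials: (component, version, license identifier).
example_bom = [
    ("spring", "2.0.2", "Apache-2.0"),
    ("some-lib", "1.0", "AGPL-3.0"),
]
```

A build step could call check_bom(example_bom) and stop the build on a non-zero result. This only covers the licenses that are declared; it does nothing about tracing binaries back to their "owner".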


--
Elian
 
Configuration Manage Engineer
MSN: smallfish961@hotmail.com
Email: smallfish382+work@gmail.com

Thursday, January 13, 2011

Hudson error: java.lang.OutOfMemoryError: Java heap space

Problem: when a Hudson build is triggered, certain repositories fail with the following error:

hudson.util.IOException2: remote file operation failed: E:\WORK_DATA\Build\Build_Src\project\project1 at hudson.remoting.Channel@1a0c382:239.161

   at hudson.FilePath.act(FilePath.java:753)

   at hudson.FilePath.act(FilePath.java:735)

   at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:653)

   at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:601)

   at hudson.model.AbstractProject.checkout(AbstractProject.java:1038)

   at hudson.model.AbstractBuild$AbstractRunner.checkout(AbstractBuild.java:481)

   at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:413)

   at hudson.model.Run.run(Run.java:1259)

   at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:47)

   at hudson.model.ResourceController.execute(ResourceController.java:95)

   at hudson.model.Executor.run(Executor.java:129)

Caused by: java.io.IOException: Remote call on 239.161 failed

   at hudson.remoting.Channel.call(Channel.java:573)

   at hudson.FilePath.act(FilePath.java:744)

   ... 10 more

Caused by: java.lang.OutOfMemoryError: Java heap space

   at java.util.Arrays.copyOfRange(Unknown Source)

   at java.lang.String.<init>(Unknown Source)

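Analysis and testing:

1. "java.lang.OutOfMemoryError: Java heap space"

Many sources say this error is related to Tomcat's memory settings, but adjusting them did not solve the problem. I also tried increasing physical memory from 2 GB to 4 GB and switching to a 64-bit operating system, i.e. 64-bit Windows Server 2008 + jdk1.6.0_22 + tomcat-6.0.29 + Hudson 1.386. The observations were as follows:

1) Day one: borrowed two 2 GB memory modules and installed 64-bit Windows Server 2008; ran the system updates, installed some software, and set up the Tomcat service.

2) Day two: I was about to work over a remote connection and found the machine was down. In the server room I saw that it had rebooted by itself and was reporting a registry failure that needed repair.

3) I planned to repair the system by booting from the CD, but it blue-screened as soon as the CD finished loading. At this point I began to suspect the two 2 GB modules, and the hardware administrator helped test the memory. We later found that two operating systems were actually installed. Puzzlingly, when I reinstalled the system the day before, the installer had detected only one 1 TB disk. I set that question aside, made the valid system disk the first boot device, and the machine booted normally.

4) With the system up, it rebooted by itself every few tens of minutes during remote use, which made it unusable.

5) Still suspecting incompatible memory, I went to the server room, removed the old modules, and left only the two 2 GB ones: blue screen, no boot. Leaving either one in alone gave the same result. In the end I swapped them for two new modules, and the machine booted normally again.

6) With the system up again, I found during remote use that the machine still rebooted every few tens of minutes: three or four times between 12 noon and 2 p.m. After 2 p.m. I went back to the server room to watch it, but saw nothing abnormal.

7) Carrying on, I started Tomcat with startup.bat and ran some jobs. After a while the system closed startup.bat by itself, and a little later the machine rebooted again. Each time I logged back in, a dialog popped up saying that Windows had recovered from an unexpected shutdown.

8) Removed the two old 1 GB wide modules and kept the two 2 GB modules (one wide, one narrow): the automatic reboots continued.

9) Kept the two old 1 GB wide modules and removed the two 2 GB modules (one wide, one narrow): the system ran normally.

10) Kept the two old 1 GB wide modules plus the wide 2 GB module: the system ran normally.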

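2. hudson.scm.SubversionSCM.checkout

1) Tried running the job on the master: everything worked. On the designated slave, it failed every time at the source checkout step. Could the problem lie in the master/slave interaction?

2) Next, removed the Source Code Management configuration and had the job call a NAnt script on the slave to fetch the code directly: everything worked. This makes me suspect Hudson's built-in SVN support. What is puzzling is that some SVN repositories show the problem and others do not. One notable difference is that the failing repositories define a large number of svn:externals properties. Could having too many externals be a factor?

Conclusion:

After this round of testing, the simplest adjustment, tuning Tomcat's memory size, did not solve the problem, although it is still the first thing to try. The root cause is probably related to Hudson's built-in SVN support. The workaround for now is to drop the Source Code Management configuration from the affected jobs and fetch the code with a background script instead.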

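Postscript:

1. Setting JAVA_OPTS

While tuning Tomcat's memory, it took several attempts at JAVA_OPTS to realize that -Xmx and MaxPermSize depend on each other. The following settings were tried on a machine with 2 GB of physical memory, for reference:

set JAVA_OPTS=-Xms1500m -Xmx1500m -XX:PermSize=128M -XX:MaxPermSize=256M    (this configuration does not work)

set JAVA_OPTS=-Xms1300m -Xmx1300m -XX:PermSize=128M -XX:MaxPermSize=256M    (this configuration works)

set JAVA_OPTS=-Xms1500m -Xmx1500m -XX:PermSize=64M -XX:MaxPermSize=128M    (this configuration works)

set JAVA_OPTS=-Xms1500m -Xmx1500m -XX:PermSize=64M -XX:MaxPermSize=150M    (this configuration works)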
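
One way to read the four results above is to add up the maximum heap and the maximum PermGen that each setting asks for. The sketch below does that arithmetic. It assumes that the combined -Xmx plus MaxPermSize total is what decides whether a setting works, which fits the four observations but is not confirmed anywhere in this post; the helper name reserved_mb is made up for illustration.

```python
import re


def reserved_mb(java_opts):
    """Sum -Xmx and -XX:MaxPermSize (both given in megabytes) from a JAVA_OPTS string."""
    xmx = int(re.search(r"-Xmx(\d+)[mM]", java_opts).group(1))
    perm = int(re.search(r"-XX:MaxPermSize=(\d+)[mM]", java_opts).group(1))
    return xmx + perm


# The four settings from the list above, with the observed outcome.
configs = [
    ("-Xms1500m -Xmx1500m -XX:PermSize=128M -XX:MaxPermSize=256M", "fails"),
    ("-Xms1300m -Xmx1300m -XX:PermSize=128M -XX:MaxPermSize=256M", "works"),
    ("-Xms1500m -Xmx1500m -XX:PermSize=64M -XX:MaxPermSize=128M", "works"),
    ("-Xms1500m -Xmx1500m -XX:PermSize=64M -XX:MaxPermSize=150M", "works"),
]

for opts, outcome in configs:
    print(reserved_mb(opts), "MB:", outcome)
```

The totals come out as 1756 MB for the failing setting and 1556, 1628 and 1650 MB for the three working ones, so on this machine the limit appears to lie somewhere between 1650 and 1756 MB.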

2. Official issue report

http://issues.hudson-ci.org/browse/HUDSON-8166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel



Tuesday, December 28, 2010

Fixing a program that blocks when using subprocess.Popen

Problem:

 While the script was running it hung. Running sudo strace against svnadmin showed it stuck on the following output, which never advanced:

 getcwd("/home/ysl/svn-repo-bak"..., 4098) = 28

 write(2, "* Dumped revision 2769.\n", 24

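Analysis:

jessinio: What does jobs show?

: Nothing

jessinio: Is that the same terminal you ran SvnDumpBak.py in?

: No, I opened another one

jessinio: Uh... that's a different car. Checking its brakes tells you nothing

: So I should press Ctrl+Z where the command is running and let it run in the background?

jessinio: Ctrl+Z, then bg

: [ysl@svn-repo-bak]$ bg

[1]+ sudo python SvnDumpBak.py yslProR &

[ysl@svn-repo-bak]$ jobs

[1]+ Running sudo python SvnDumpBak.py yslProR &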

 

jessinio: ps auxwww|grep strace

: [ysl@svn-repo-bak]$ ps auxwww|grep strace

ysl 10851 0.0 0.0 61144 728 pts/0 S+ 14:45 0:00 grep strace

root 31108 2.2 0.0 4112 632 pts/1 S+ 14:31 0:17 strace -p 17686

jessinio: There's another strace

: I opened another connection and ran strace there to watch. It was still updating a moment ago. Now it's stuck

jessinio: Stop that strace

: Ctrl+C??

: Stopped

 

jessinio: ps auxwww|grep 7z

: root 17687 85.5 4.9 233772 199576 pts/0 Sl 14:15 27:37 /usr/local/p7zip_9.13/bin/7z a -si /data/backup/svnRepo//20101227/yslProR.full20101227.dump -pNBc5RB!

jessinio: strace -p 17687

: [ysl@svn-repo-bak]$ sudo strace -p 17687

Process 17687 attached - interrupt to quit

[ Process PID=17687 runs in 32 bit mode. ]

futex(0x86b54ac, FUTEX_WAIT_PRIVATE, 1366961, NULL

jessinio: top -p 17687

: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

17687 root 15 0 228m 194m 1856 S 0.0 4.9 27:37.96 7z

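jessinio: Is it changing?

: The value under TIME+ isn't changing

jessinio: Still no change?

: No, still 27:37.96

jessinio: ps auxwww|grep svnadmin

: root 17686 2.4 2.5 205868 103152 pts/0 S 14:15 1:10 svnadmin dump /storage/repool/yslProR -r 0:4855

jessinio: There must be a problem in your code.

: But why do small repositories work fine?

jessinio: Because the buffer doesn't fill up

Conclusion:

    With a colleague's help, the cause was traced to a full buffer. Checking the program, the original version was: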

    p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

    p.wait()

    # command exec error,print detail info

    if p.returncode:     

        write_log(p.stdout.read())

    return p.returncode

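    In other words, the output was never consumed, and a pipe has limited capacity. Once the pipe is full, the writer blocks or fails. As the pipe(7) man page puts it:

Pipe Capacity

      A pipe has a limited capacity. If the pipe is full, then a write(2) will block or fail, depending on whether the O_NONBLOCK flag is set (see below).

Solution: change the script to read the output from the PIPE into a variable before waiting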

    p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

    retval = p.stdout.read()

    p.wait()

    # command exec error,print detail info

    if p.returncode:     

        write_log(retval)

    return p.returncode
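
An alternative sketch of the same fix uses Popen.communicate(), which reads the pipe until end-of-file and then waits for the process, so the child cannot stall on a full pipe. The function name run_and_log and the log callback are stand-ins for the script's own function and its write_log helper, which are not shown in full in this post.

```python
import subprocess


def run_and_log(cmd, log=print):
    """Run cmd through the shell, draining its output so the pipe cannot fill."""
    p = subprocess.Popen(cmd, shell=True,
                         stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    # communicate() reads until EOF and then waits for the process to exit.
    output, _ = p.communicate()
    if p.returncode:
        # Command failed: log the captured output.
        log(output.decode(errors="replace"))
    return p.returncode
```

Both this version and the one above hold the whole output in memory. If a command can print a very large amount, redirecting its output to a file is probably the safer choice.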

References:

    subprocess module: http://docs.python.org/library/subprocess.html



svnadmin dump | 7z command fails

Problem:

7-Zip 9.13 beta  Copyright (c) 1999-2010 Igor Pavlov  2010-04-15

p7zip Version 9.13 (locale=C,Utf16=off,HugeFiles=on,2 CPUs)

Compressing  [Content]

System error:

Operation not permitted

svnadmin: Can't write to stream: Broken pipe

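Question: why do dumps of small repositories compress with 7z just fine, while large repositories fail with errors like the one above?

Analysis:

1. Run mount to get the partition information

2. Run df -h to check disk usage

3. Check the failing command line for mistakes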

svnadmin dump /storage/yslProR -r 0:1000 | /usr/local/p7zip_9.13/bin/7z a -si /data/space/svndumpbak/20101229/yslProR.full20101229.dump -p2048<password>

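Conclusion:

   1. The problem was caused by running out of space: /data was 100% full, as shown here:

     /dev/mapper/VolGroupData-LogVolData   10G   10G   74M 100% /data

   2. The space ran out, in turn, because /data/space had not been mounted successfully, so the backups were being written to the local /data volume instead.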

Solution:

    1. Remount the backup storage

      mount -t nfs 192.168.5.5:/data/space/ /data/space/

    2. To keep the mount from being lost when the machine reboots, add the mount command above to /etc/rc.local
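
To catch a missing mount before a dump starts, the backup script could check the target first. This is a minimal sketch under assumptions: the function name check_backup_target and the free-space threshold are made up for illustration and are not part of the original script.

```python
import os
import shutil


def check_backup_target(mount_point, min_free_bytes):
    """Raise RuntimeError unless mount_point is mounted and has enough free space."""
    if not os.path.ismount(mount_point):
        raise RuntimeError(f"{mount_point} is not mounted")
    free = shutil.disk_usage(mount_point).free
    if free < min_free_bytes:
        raise RuntimeError(f"{mount_point} has only {free} bytes free")
    return free
```

For example, calling check_backup_target("/data/space", 5 * 1024**3) before running svnadmin dump would stop the script with a clear message when the NFS share is not mounted, instead of letting 7z fail partway through.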

