Author archive: ℃冻番茄

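Deploying a single-node Hadoop 1.0.3 environment on Ubuntu 12.04

I only started working with Hadoop a few days ago and am learning how to set up a Hadoop cluster. Here is what I did, using the latest releases: Ubuntu 12.04 + Hadoop 1.0.3.

Hadoop download [renren's mirror in China, which is fairly fast]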
http://labs.renren.com/apache-mirror//hadoop/core/

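Installing the Java runtime on Ubuntu 12.04 (64-bit)
sudo apt-get install openjdk-6-jdk
The final install location is /usr/lib/jvm/java-6-openjdk-amd64 [this is the 64-bit path; on a non-64-bit system, check the directory for the actual name]

After downloading Hadoop, extract it under /home.

The Hadoop directory is then /home/hadoop-1.0.3

Change the owner of the Hadoop directory (xzy is the Linux user account that should own it)
sudo chown -R xzy:xzy hadoop-1.0.3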

cd hadoop-1.0.3

vim conf/hadoop-env.sh
Find the line below, uncomment it, and set it to the JDK path
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64

Verify that Hadoop is installed correctly
bin/hadoop version
Hadoop 1.0.3
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1335192
Compiled by hortonfo on Tue May 8 20:31:25 UTC 2012
From source with checksum e6b0c1e23dcf76907c5fecb4b832f3be

Final step: run the WordCount demo job
mkdir input
cp conf/* input
bin/hadoop jar hadoop-examples-1.0.3.jar wordcount input output
cat output/*
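
To make the expected output easier to read, here is a minimal Python sketch of what the WordCount example computes, as far as I understand it: it splits every input line on whitespace and counts how often each token occurs. The function name `word_count` and the sample lines are made up for illustration; this is not Hadoop's own code.

```python
from collections import Counter

def word_count(lines):
    """Count whitespace-separated tokens across all lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts

# Hypothetical sample input standing in for the files copied into input/.
sample = ["export JAVA_HOME", "export HADOOP_HOME", "# export"]
for word, count in sorted(word_count(sample).items()):
    print(f"{word}\t{count}")
```

Each line of `cat output/*` should have the same shape: a token, a tab, and its count.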

A simple watchdog daemon written in shell

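[codesyntax lang="bash"]

#!/bin/sh
# Directory and file name of the script to keep alive
PRO_PATH="/home/sh"
PROGRAM="rediscache.sh"

while true ; do
    # Count matching processes, excluding the grep process itself
    PRO_NOW=`ps aux | grep "$PROGRAM" | grep -v grep | wc -l`

    # Start another instance while fewer than 3 matching processes exist
    if [ $PRO_NOW -lt 3 ]; then
        #echo "exec $PROGRAM"
        "$PRO_PATH/$PROGRAM" >/dev/null 2>&1 &
    fi

    # Count matching processes whose ps line contains T (stopped state)
    PRO_STAT=`ps aux | grep "$PROGRAM" | grep T | grep -v grep | wc -l`

    # Kill stopped instances and start a fresh one
    if [ $PRO_STAT -gt 0 ] ; then
        killall -9 "$PROGRAM"
        "$PRO_PATH/$PROGRAM" >/dev/null 2>&1 &
    fi
    sleep 2
done
exit 0

[/codesyntax]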
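
The two `ps aux | grep ...` checks in the script can be sketched in Python as pure functions over the `ps` output, which makes them easy to try without touching real processes. One difference worth noting: `grep T` matches a capital T anywhere in the line, while this sketch reads the STAT column (the 8th field of `ps aux`). The function names and the sample output are hypothetical.

```python
def count_processes(ps_output, program):
    """Count ps lines that mention `program`, skipping the grep helper itself."""
    return sum(
        1
        for line in ps_output.splitlines()
        if program in line and "grep" not in line
    )

def count_stopped(ps_output, program):
    """Count matching processes whose STAT column starts with T (stopped)."""
    stopped = 0
    for line in ps_output.splitlines():
        # ps aux prints 11 columns; the last one (COMMAND) may contain spaces.
        fields = line.split(None, 10)
        if len(fields) == 11 and program in fields[10] and fields[7].startswith("T"):
            stopped += 1
    return stopped

# Hypothetical `ps aux` output used only for illustration.
SAMPLE = """\
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 101 0.0 0.1 4440 652 ? S 10:00 0:00 /bin/sh /home/sh/rediscache.sh
root 102 0.0 0.1 4440 652 ? T 10:00 0:00 /bin/sh /home/sh/rediscache.sh
root 103 0.0 0.0 9388 936 pts/0 S+ 10:01 0:00 grep rediscache.sh
"""

print(count_processes(SAMPLE, "rediscache.sh"), count_stopped(SAMPLE, "rediscache.sh"))
```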

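Proactive redis caching driven by SQL

Main features

Proactive caching: under normal conditions data is read only from redis and never from mysql
Categorizing and sorting lists
Distinguishing in-progress, not-yet-started, and finished items in the cached data
Analyzing SQL statements to trigger updates of the data in the redis cache
Synchronously updating associated data (for example, ad data embeds product information; when a product changes, the related product data inside the ads is updated as well)
Rebuilding the cached data of a single table

Continue reading →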

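Configuring PATH_INFO in nginx and rewriting away the index.php in the middle of the URL

Looking at my blog statistics recently, many visitors still arrive by searching for keywords such as nginx PATH_INFO configuration and removing the index.php from the middle of the URL, so I am posting the configuration I currently use.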

server
{
    listen 80;
    server_name ye55.dev;
    index index.html index.htm index.php default.html default.htm default.php;
    root /home/www/ye55/trunk/www/;

    if (!-f $request_filename) {
        rewrite ^/(.*)$ /index.php/$1 last;
    }
    location ~ .*\.php(.*)$
    {
        fastcgi_pass unix:/tmp/php-cgi.sock;
        fastcgi_index index.php;
        include fcgi.conf;
        fastcgi_split_path_info ^(.+\.php)(.*)$;
        include fastcgi_params;
        fastcgi_param PATH_INFO $fastcgi_path_info;
    }
}
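
To see what the rewrite rule and `fastcgi_split_path_info` do to a request, here is a small Python sketch that applies the same two regular expressions. It only mimics the string handling, not nginx itself; the function names and the sample URI are made up for illustration.

```python
import re

# Same pattern as the fastcgi_split_path_info directive in the config.
SPLIT = re.compile(r"^(.+\.php)(.*)$")

def rewrite(uri):
    """Mimic the rewrite rule applied when the request is not an existing file."""
    return re.sub(r"^/(.*)$", r"/index.php/\1", uri)

def split_path_info(uri):
    """Return (script name, PATH_INFO) the way the two capture groups split the URI."""
    match = SPLIT.match(uri)
    if match is None:
        return uri, ""
    return match.group(1), match.group(2)

rewritten = rewrite("/user/profile/42")
print(rewritten)
print(split_path_info(rewritten))
```

The first capture group becomes the script name and the second one is what ends up in PATH_INFO, so the application sees `/user/profile/42` even though the visitor never typed index.php.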

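Proactive caching based on analyzing SQL statements

When I was working in Shanghai, I built a proactive list cache on top of redis. Production was on version 1.0 at the time, so that list cache was very crude and did no more than meet the requirements, but it ran quite stably, which shows that the approach is feasible with redis.

On top of that, I have recently been reading up on redis and went through it systematically from the beginning, so I felt the urge to improve that old proactive list cache. The plan is to use it first in a new project at my current company and, if it works, to roll it out to the older projects to improve performance.

To implement proactive caching, the main problems are the following

1. How to update the data in redis synchronously when the database is updated, deleted from, or inserted into

2. How the data should be stored in redis

3. How to sort and categorize within the proactive cache

4. How to keep serving lists normally if redis stops unexpectedly

5. Which layer the list cache should live in: dao? service?

6. How to evict a single piece of data in redis once it becomes invalid

7. How to reduce network requests and fetch data spread over several keys with as few commands as possible

The design is still being worked out; I am writing a few things down for the record and will refine them step by step!

I. Triggering a real-time update of the redis data when the mysql data changes, with as few changes to existing code as possible

So I want to add one more layer, a cache layer, that can be called from both the Controller layer and the service layer. The data in the cache can come from the dao or from the service layer.

To solve the first problem, my idea is to capture the SQL statement when it is executed, analyze it, and decide whether the data in redis needs to be updated.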
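
As a rough illustration of the "analyze the SQL" step, the sketch below uses a regular expression to extract the write operation and the table name, and reports them only when the table is one of the cached tables. This is a simplification under my own assumptions: the function name `analyze_sql` is hypothetical, and statements such as REPLACE INTO, INSERT IGNORE, or schema-qualified table names are not handled.

```python
import re

# Matches the leading keyword of a write statement and the table that follows it.
_WRITE_SQL = re.compile(
    r"^\s*(insert\s+into|update|delete\s+from)\s+`?(\w+)`?",
    re.IGNORECASE,
)

def analyze_sql(sql, cached_tables):
    """Return (operation, table) if `sql` writes to a cached table, else None."""
    match = _WRITE_SQL.match(sql)
    if match is None:
        return None  # SELECT and anything unrecognized never touches the cache
    operation = match.group(1).split()[0].lower()
    table = match.group(2)
    if table not in cached_tables:
        return None
    return operation, table

print(analyze_sql("UPDATE db_user SET vip = 1 WHERE id = 7", {"db_user"}))
print(analyze_sql("SELECT * FROM db_user", {"db_user"}))
```

A cache layer could call such a function right before or after the statement is executed and refresh the affected redis keys when the result is not None.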

II. The configuration needed for the redis proactive cache, as follows

<?php

return array(
	// cache the data of the db_user table
	'db_user' => array(
		// fields that define categories, used to build multiple id index sets
		'cate' => array('group_id', 'vip'),
		// fields used for sorting (numeric)
		'sort' => array('sort', 'create_time', 'last_time', 'login_num'),
		'callback' => array(
			// callback class and method, used to refresh a single record
			'get' => array('common::getUserService()', 'getOne'),
			// restores data from mysql when the redis data is lost; needs reflection to inject the data, still to be refined
			'getlist' => array('common::getUserService()', 'getList'),
		),
	),
);

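The main point is that the configuration tells the program how to produce its output: if redis is unavailable, the redis cache is bypassed and the data is served directly from mysql through the callback methods.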
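
The fallback idea can be sketched in a few lines of Python: try the cache first and call the configured `getlist` callback only when the cache raises an error. Everything here is a stand-in under my own assumptions: `CacheUnavailable`, `fetch_list`, and the lambda that plays the role of getUserService()->getList() are hypothetical names.

```python
class CacheUnavailable(Exception):
    """Hypothetical stand-in for a redis connection failure."""

def fetch_list(table, config, read_from_cache):
    """Serve a list from the cache, or from the configured callback if the cache is down."""
    try:
        return read_from_cache(table)
    except CacheUnavailable:
        load_from_db = config[table]["callback"]["getlist"]
        return load_from_db()

# Hypothetical wiring that mirrors the shape of the PHP configuration.
config = {"db_user": {"callback": {"getlist": lambda: ["row-from-mysql"]}}}

def broken_cache(table):
    raise CacheUnavailable(table)

def healthy_cache(table):
    return ["row-from-redis"]

print(fetch_list("db_user", config, healthy_cache))
print(fetch_list("db_user", config, broken_cache))
```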

How redis stores the cached data:

1. One or more sets store the id index data. For example, if the cate field is not configured, the key looks like listcache:db_user:ids

If the cate field is configured, there is a group of sets such as listcache:db_user:ids:cate:group_id:1, each holding the corresponding ids

2. A separate hash stores the content; its structure depends on the sort configuration
hset list:cache:db_user:content:id:1 sort 1
hset list:cache:db_user:content:id:1 create_time 122323123
hset list:cache:db_user:content:id:1 last_time 1223231223
hset list:cache:db_user:content:id:1 login_num 20
hset list:cache:db_user:content:id:1 data <serialized user data>
The fields other than data are used for sorting

Finally, the list is fetched with the sort command
e.g. sort list:cache:ids BY list:cache:db_user:content:id:*->sort DESC LIMIT 0 10
The command above returns the sorted ids
The final data can also be fetched directly: sort list:cache:ids BY list:cache:db_user:content:id:*->sort GET list:cache:db_user:content:id:*->data DESC LIMIT 0 10
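
To make the SORT ... BY ... GET idea concrete without a redis server, here is an in-memory Python sketch: a set of ids plays the role of the index set, a dict of dicts plays the role of the per-id hashes, and the two functions mimic sorting by a hash field and then fetching another field. The names `sort_ids` and `sort_get` and the sample values are hypothetical, and tie-breaking may differ from real redis.

```python
def sort_ids(ids, hashes, by_field, desc=True, offset=0, count=10):
    """Order ids by a numeric hash field, like SORT ids BY hash->field LIMIT offset count."""
    ordered = sorted(ids, key=lambda i: float(hashes[i][by_field]), reverse=desc)
    return ordered[offset:offset + count]

def sort_get(ids, hashes, by_field, get_field, **kwargs):
    """Like adding GET hash->get_field: return that field instead of the ids."""
    return [hashes[i][get_field] for i in sort_ids(ids, hashes, by_field, **kwargs)]

# Hypothetical sample data: one hash per user id.
ids = {1, 2, 3}
hashes = {
    1: {"sort": "1", "login_num": "20", "data": "user-1"},
    2: {"sort": "5", "login_num": "7", "data": "user-2"},
    3: {"sort": "3", "login_num": "12", "data": "user-3"},
}

print(sort_ids(ids, hashes, "sort"))
print(sort_get(ids, hashes, "login_num", "data", desc=False, count=2))
```

The point of the layout is that one command returns a sorted, paged list, so the application does not need one round trip per record.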

I will stop here; this write-up is rather messy. Later I will put together some diagrams, which should be easier to follow!

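A PHP router that rewrites URLs

I recently started writing a website, developing everything from the framework to the application from scratch. Much of the code is reinventing the wheel, but the main goal is practice: I noticed my thinking had become a bit rigid lately, so it was time to write a project properly from start to finish!

The router in my earlier framework was very limited. It only selected the controller and action, the code was crude, nothing was filtered, and it had security problems, so I rewrote it.

Framework error message

Main features:

Selecting the controller and action
Prettifying URLs with PHP regex rewrites
Parameter filtering
Automatic choice between PATH_INFO and REQUEST_URI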
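
Since the full post is not shown here, the following Python sketch is only my guess at how three of these features could fit together: it prefers PATH_INFO and falls back to REQUEST_URI, picks the controller and action from the first two path segments, and filters names and parameters. The function `route`, its defaults, and the key/value parameter convention are all assumptions, not the framework's actual code.

```python
import re

SAFE_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")

def route(server, default=("index", "index")):
    """Resolve (controller, action, params) from PATH_INFO, else from REQUEST_URI."""
    path = server.get("PATH_INFO") or server.get("REQUEST_URI", "").split("?", 1)[0]
    segments = [s for s in path.split("/") if s]
    controller = segments[0] if len(segments) > 0 else default[0]
    action = segments[1] if len(segments) > 1 else default[1]
    # Reject anything that could not be a class or method name.
    if not (SAFE_NAME.match(controller) and SAFE_NAME.match(action)):
        raise ValueError("illegal controller or action name")
    # Remaining segments are read as key/value pairs; non-word characters are dropped.
    rest = segments[2:]
    params = {
        re.sub(r"\W", "", key): re.sub(r"\W", "", value)
        for key, value in zip(rest[0::2], rest[1::2])
    }
    return controller, action, params

print(route({"PATH_INFO": "/user/show/id/42"}))
print(route({"REQUEST_URI": "/blog/list?page=2"}))
```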

Continue reading →

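Sharing a function I currently use to create nested directories

It looks simple: recursion plus the && operator. It is easier to follow if you are used to syntax like !$a && $a = true;

public static function mkdir($path) {
	// Already a directory: nothing to do
	if (is_dir($path)) return true;
	// Create the parent first; only then create $path itself
	return self::mkdir(dirname($path)) && mkdir($path);
}

Correction: since PHP 5 you can simply call mkdir($path, 0777, true); to create directories recursively!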
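
For comparison, here is the same recursion as a Python sketch: create the parent first, then the directory itself, and report failure as False. The function name `make_dirs` and the temporary directory used in the demo are my own; in practice the standard library's `os.makedirs(path, exist_ok=True)` does this job, much like PHP's recursive mkdir.

```python
import os
import tempfile

def make_dirs(path):
    """Create `path` and any missing parents; return True if the directory exists afterwards."""
    if os.path.isdir(path):
        return True
    parent = os.path.dirname(path)
    # Recurse into the parent unless we have reached the top of the path.
    if parent and parent != path and not make_dirs(parent):
        return False
    try:
        os.mkdir(path)
    except OSError:
        # Either creation failed or someone else created it in the meantime.
        return os.path.isdir(path)
    return True

# Demo inside a throwaway temporary directory.
base = tempfile.mkdtemp()
target = os.path.join(base, "a", "b", "c")
print(make_dirs(target), os.path.isdir(target))
```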

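Consistent hashing for memcache, implemented in PHP

I have not written blog posts seriously for a while. Lately I have been reading articles on distributed systems, so I implemented consistent hashing in PHP for practice. I used to distribute keys with the most basic hash-modulo scheme, where adding or removing a single memcache server in production invalidates virtually all of the cached data. Consistent hashing exists to solve exactly this problem by keeping the amount of invalidated data to a minimum; google it for background material!

The PHP implementation loses some efficiency; if you need high performance, writing an extension is the better choice.
In my test with 5 memcache servers, 200 virtual nodes per server, and 1000 set plus get operations, the consistent-hash distribution was about 5 times slower than a single native instance, so there is room for optimization.
Implementation steps:

Hash each memcache configuration entry (ip + port + virtual node sequence number) with crc32 to form a closed ring.
Apply crc32 to the key being operated on.
Use binary search to find the nearest virtual node on the ring.
Extract the real memcache ip and port from the virtual node and connect through a singleton.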
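
The ring lookup described in the steps above can be sketched in Python as follows. It covers only the hashing and the binary search, not the memcache connections. The class name, the server addresses, and the exact string that gets hashed for a virtual node (`"ip:port#index"`) are assumptions made for illustration.

```python
import bisect
import zlib

class ConsistentHash:
    def __init__(self, servers, replicas=200):
        # Hash every virtual node ("ip:port#index") onto a 32-bit ring and sort it.
        self._ring = sorted(
            (zlib.crc32(f"{server}#{i}".encode()), server)
            for server in servers
            for i in range(replicas)
        )
        self._hashes = [h for h, _ in self._ring]

    def lookup(self, key):
        """Return the server of the first virtual node at or after the key's hash."""
        h = zlib.crc32(key.encode())
        idx = bisect.bisect_left(self._hashes, h)
        if idx == len(self._hashes):
            idx = 0  # wrap around: the ring is closed
        return self._ring[idx][1]

# Hypothetical server list.
servers = [f"10.0.0.{n}:11211" for n in range(1, 6)]
ring = ConsistentHash(servers)
print(ring.lookup("user:42"))
```

When a sixth server is added, a key either stays where it was or moves to the new server, because the old virtual nodes keep their positions on the ring. That is the property that keeps invalidation low compared with hash modulo.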

The code is as follows: Continue reading →

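A shared lock mechanism built on redis

Here is part of the main code

Thanks to HaKeem for the reminder: a bug was found where the previous code could fail to unlock, so here is the corrected code
With redis 2.2 or later you can add the watch command to the lock logic to guard the single key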

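	/**
	 * Acquire the lock. [New locking algorithm: redis is used as the lock; if redis is unavailable, the value of the redisErrReturn variable is returned]
	 * @param string $key unique identifier
	 * @param int $expire lock lifetime in seconds
	 * @return
	 */
	public function lock($key, $expire = 5) {
		try {
			list($key, $lockIdKey) = $this->_getLockKey($key);
			// Race for the lock: the first thread wins and writes the expiry time into the lock
			if (Common::getQueue()->setnx($key, Common::getTime() + $expire)) {
				// Record the owner here as well, otherwise unlock() cannot match the lock id
				Common::getQueue()->set($lockIdKey, $this->_lockId);
				return true;
			}
			// Threads that lost the race check for a dead lock; if the lock has not expired, return false
			if (Common::getQueue()->get($key) > Common::getTime()) return false;
			// The lock was left behind by a failure and has expired, so concurrent threads race again.
			// getset returns the previous expiry time; if it has not expired, another thread already took the lock, so return false
			if (Common::getQueue()->getset($key, Common::getTime() + $expire) > Common::getTime()) return false;
			Common::getQueue()->set($lockIdKey, $this->_lockId);
			return true;
		} catch (RedisException $e) {
			// When a redis exception is caught, locking is abandoned and redisErrReturn is returned
			$this->_lockStatus = false;
			return $this->redisErrReturn;
		}
	}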
	
	/**
	 * Renew the lock, extending its lifetime
	 * @param string $key
	 * @param int $expire       number of seconds to extend the lock by
	 * @param int $triggerTime  renew once the remaining lifetime drops to this many seconds
	 * @return
	 */
	public function updateExpire($key, $expire = 3, $triggerTime = 2) {
		if (!$this->_lockStatus) return true; // redis is unavailable, the lock is bypassed
		list($key, $lockIdKey) = $this->_getLockKey($key);
		$lockId = Common::getQueue()->get($lockIdKey);
		if ($this->_lockId === null || $lockId != $this->_lockId) return false;
		$time = Common::getQueue()->get($key);
		if ($time - $triggerTime <= Common::getTime()) return Common::getQueue()->set($key, $time + $expire);
		return true; // not yet inside the renewal window, the lock is still held
	}
	
	/**
	 * Release the lock
	 * @param string $key
	 * @return
	 */
	public function unlock($key) {
		if (!$this->_lockStatus) return true; // redis is unavailable, skip unlocking
		list($key, $lockIdKey) = $this->_getLockKey($key);
		$lockId = Common::getQueue()->get($lockIdKey);
		if ($this->_lockId === null || $lockId != $this->_lockId) return false;
		Common::getQueue()->del($key);
		Common::getQueue()->del($lockIdKey);
		return true;
	}
	
	/**
	 * Remove useless keys left behind by dead locks; by default, keys older than 24 hours are removed
	 * @param int $cleanTime  defaults to 24 hours
	 */
	public function cleanLockKey($cleanTime = 86400) {
		if ($cleanTime < 3600 || !$keys = Common::getQueue()->keys('PwLock:*')) return false;
		$i = 0;
		foreach ($keys as $value) {
			list($time, $lockId) = explode('|', Common::getQueue()->get($value));
			if (Common::getTime() - $time > $cleanTime) {
				if (Common::getQueue()->del($value)) $i++;
			}
		}
		return $i;
	}
	
	private function _getLockKey($key) {
		return array('PwLock:' . $key, 'PwLock:' . $key . ':lockId');
	}
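
The core of lock() is the setnx / get / getset sequence. The Python sketch below replays that sequence against a tiny in-memory stand-in, so the three outcomes (fresh lock, lock still held, expired lock taken over) can be tried without a redis server. `FakeRedis` and `acquire` are hypothetical names, the lock-id bookkeeping is left out, and a real deployment would of course talk to redis itself.

```python
import time

class FakeRedis:
    """In-memory stand-in for the three redis commands the lock relies on."""
    def __init__(self):
        self.data = {}

    def setnx(self, key, value):
        if key in self.data:
            return False
        self.data[key] = value
        return True

    def get(self, key):
        return self.data.get(key)

    def getset(self, key, value):
        old = self.data.get(key)
        self.data[key] = value
        return old

def acquire(store, key, expire=5, now=None):
    """Try to take the lock; the stored value is the lock's expiry timestamp."""
    now = time.time() if now is None else now
    # 1. Nobody holds the lock: setnx succeeds and we own it.
    if store.setnx(key, now + expire):
        return True
    # 2. Somebody holds it and it has not expired yet.
    current = store.get(key)
    if current is not None and current > now:
        return False
    # 3. The lock expired: whoever sees an expired previous value from getset wins.
    previous = store.getset(key, now + expire)
    return previous is None or previous <= now

store = FakeRedis()
print(acquire(store, "PwLock:job", now=100))  # fresh lock
print(acquire(store, "PwLock:job", now=102))  # still held
print(acquire(store, "PwLock:job", now=106))  # expired, taken over
```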

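Splitting a tar archive into volumes on Linux

To archive a directory into split volumes, for example a directory named linux,
run the following from the parent directory of linux:
#tar cvf - linux | split -b 2m (archive and split into 2 MB volumes)
#cat x* > linux.tar (join the volumes back into one archive)
or, with gzip compression:
#tar czvf linux.tar.gz linux/ (a single compressed archive, without splitting)
#tar czvfp - linux/ | split -b 2m (compress and split into 2 MB volumes)
#cat x* > linux.tar.gz (join the volumes back into one archive)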
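
What `split -b` and `cat x* >` do to the archive can be sketched in Python: cut a file into fixed-size pieces and concatenate them back in order. The function names, the numeric part suffixes (the real split uses xaa, xab, ...), and the random demo file are assumptions made for illustration.

```python
import os
import tempfile

def split_file(path, chunk_size, prefix):
    """Write `path` out as numbered parts of at most `chunk_size` bytes; return their names."""
    parts = []
    with open(path, "rb") as src:
        index = 0
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            part = f"{prefix}{index:04d}"
            with open(part, "wb") as dst:
                dst.write(chunk)
            parts.append(part)
            index += 1
    return parts

def join_files(parts, target):
    """Concatenate the parts in order, like `cat x* > target`."""
    with open(target, "wb") as dst:
        for part in parts:
            with open(part, "rb") as src:
                dst.write(src.read())

# Demo with a 5000-byte stand-in for the archive and 2048-byte volumes.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "linux.tar")
with open(archive, "wb") as f:
    f.write(os.urandom(5000))
parts = split_file(archive, 2048, os.path.join(workdir, "x"))
restored = os.path.join(workdir, "restored.tar")
join_files(parts, restored)
print(len(parts), os.path.getsize(restored))
```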


℃冻番茄's Blog

Notes from everyday work and study…