博客折腾记

发表于 2017-02-05 | 分类于未分类

第一阶段，刚接触Blog，觉得很新鲜，试着选择一个免费空间来写。
第二阶段，发现免费空间限制太多，就自己购买域名和空间，搭建独立博客。
第三阶段，觉得独立博客的管理太麻烦，最好在保留控制权的前提下，让别人来管，自己只负责写文章。

上面这段换摘抄自阮一峰的博客，有一种似曾相识的感觉：

编程爱好者博客
百度Hi空间
CSDN
虚拟空间
域名+虚拟空间
域名+VPS
GitHub Pages+Hexo。

c++中new/delete重载总结

发表于 2015-05-31 | 分类于 C++

这个问题看了很多次，忘了很多次，最近在重温内存池时又看了下，做个笔记吧。

首先是两个名词：new表达式，new操作符，

默认情况下当你写下如下语句时:

Foo* foo = new Foo;(这里是new-expression)，实际执行了三个步骤：

1：new表达式调用一个operator new的标准库函数，分配一块足够大、原始的、未命名的内存空间

2：编译器运行相应的构造函数，并传入初始值

3：对象分配空间并构造完成，返回一个指向该对象的指针

delete foo，实际执行了两个相反的步骤：

1：调用该对象的析构函数

2：调用operator delete标准库函数，回收空间

c++中new/delete有如下级别：

global new/delete
class specific new/delete
通过代码来说明问题比较直观
重载全局操作符
```
#include <iostream>
#include <stdlib.h>
```

class Foo
{
public:
Foo()
{
std::cout << FUNCTION << std::endl;
}
~Foo()
{
std::cout << FUNCTION << std::endl;
}
};

//重载全局new操作符，和一般的操作符重载类似
void operator new(size_t sz)
{
std::cout << FUNCTION << std::endl;
void m = malloc(sz);
if(!m)
{
std::cerr << “out of memory” << std::endl;
}

return m;

}

//重载全局delete操作符，和一般的操作符重载类似
void operator delete(void* m)
{
std::cout << FUNCTION << std::endl;
free(m);
}

int main(int argc, const char argv[])
{
//new-expression(表达式)
//首先调用全局的operator(操作符) new分配内存,再调用构造函数
Foo foo = new Foo;
//delete-expression
//首先调用析构函数,再调用全局的operator(操作符) delete释放内存
delete foo;

return 0;

}

类操作符

#include <cstddef>
#include <iostream>
#include <new>

class Framis
{
enum { sz = 10 };
char c[sz]; // To take up space, not used
static unsigned char pool[];
static bool alloc_map[];
public:
enum { psize = 100 }; // frami allowed
Framis() { std::cout << FUNCTION << std::endl; }
~Framis() { std::cout << FUNCTION << std::endl; }
void operator new(size_t) throw(std::bad_alloc);
void operator delete(void);
};

unsigned char Framis::pool[psize * sizeof(Framis)];
bool Framis::alloc_map[psize] = {false};

// Size is ignored – assume a Framis object
void Framis::operator new(size_t) throw(std::bad_alloc)
{
for(int i = 0; i < psize; i++)
{
if(!alloc_map[i])
{
std::cout << “using block “ << i << “ … “;
alloc_map[i] = true; // Mark it used
return pool + (i sizeof(Framis));
}
}
std::cout << “out of memory” << std::endl;
throw std::bad_alloc();
}

void Framis::operator delete(void* m)
{
if(!m) return; // Check for null pointer
// Assume it was created in the pool
// Calculate which block number it is:
unsigned long block = (unsigned long)m - (unsigned long)pool;
block /= sizeof(Framis);
std::cout << “freeing block “ << block << std::endl;
// Mark it free:
alloc_map[block] = false;
}

int main()
{
Framis f[Framis::psize];
try
{
for(int i = 0; i < Framis::psize; i++)
{
f[i] = new Framis;
}
new Framis; // std::cout of memory
}
catch(std::bad_alloc)
{
std::cerr << “out of memory!” << std::endl;
}
delete f[10];
f[10] = 0;
// Use released memory:
Framis x = new Framis;
delete x;
for(int j = 0; j < Framis::psize; j++)
{
delete f[j]; // Delete f[10] OK
}

return 0;

}
上面的例子是从<<think in c++>>中copy过来的，类的new/delete操作符函数，尽管你可以不写static关键字，但是编译器其实把其当做static函数的。

另外还有个叫做placement new/delete表达式的重载，可以在指定的地址空间上构造/析构对象。

Designing Distributed Systems

发表于 2014-03-11 | 分类于 architecture ，转载

Bill Venners

What is important to keep in mind when you are designing a distributed system?

Ken Arnold

We should start off with some notion of what we mean by distributed system. A distributed system, in the sense in which I take any interest, means a system in which the failure of an unknown computer can screw you.

Failure is not such an important factor for some multicomponent distributed systems. Those systems are tightly controlled; nobody ever adds anything unexpectedly; they are designed so that all components go up and down at the same time. You can create systems like that, but those systems are relatively uninteresting. They are also quite rare.

Failure is the defining difference between distributed and local programming, so you have to design distributed systems with the expectation of failure. Imagine asking people, “If the probability of something happening is one in ten to the thirteenth, how often would it happen?” Your natural human sense would be to answer, “Never.” That is an infinitely large number in human terms. But if you ask a physicist, she would say, “All the time. In a cubic foot of air, those things happen all the time.” When you design distributed systems, you have to say, “Failure happens all the time.” So when you design, you design for failure. It is your number one concern.

Yes, you have to get done what you have to get done, but you have to do it in the context of failure. One reason it is easier to write systems with Jini and RMI (remote method invocation) is because they’ve taken the notion of failure so seriously. We gave up on the idea of local/remote transparency. It’s a nice thought, but so is instantaneous faster-than-light travel. It is demonstrably true that at least so far transparency is not possible.

Partial Failure

What does designing for failure mean? One classic problem is partial failure. If I send a message to you and then a network failure occurs, there are two possible outcomes. One is that the message got to you, and then the network broke, and I just couldn’t get the response. The other is the message never got to you because the network broke before it arrived. So if I never receive a response, how do I know which of those two results happened? I cannot determine that without eventually finding you. The network has to be repaired or you have to come up, because maybe what happened was not a network failure but you died.

Now this is not a question you ask in local programming. You invoke a method and an object. You don’t ask, “Did it get there?” The question doesn’t make any sense. But it is the question of distributed computing.

So considering the fact that I can invoke a method on you and not know if it arrives, how does that change how I design things? For one thing, it puts a multiplier on the value of simplicity. The more things I can do with you, the more things I have to think about recovering from. That also means the conceptual cost of having more functionality has a big multiplier. In my nightmares, I’ll tell you it’s exponential, and not merely a multiplier. Because now I have to ask, “What is the recovery strategy for everything on which I interact with you?” That also implies that you want a limited number of possible recovery strategies.

Transactions

So what are those recovery strategies? J2EE (Java 2 Platform, Enterprise Edition) and many distributed systems use transactions. Transactions say, “I don’t know if you received it, so I am forcing the system to act as if you didn’t.” I will abort the transaction. Then if you are down, you’ll come up a week from now and you’ll be told, “Forget about that. It never happened.” And you will.

Transactions are easy to understand: I don’t know if things failed, so I make sure they failed and I start over again. That is a legitimate, straightforward way to deal with failure. It is not a cheap way however.

Transactions tend to require multiple players, usually at least one more player than the number of transaction participants, including the client. And even if you can optimize out the extra player, there are still more messages that say, “Am I ready to go forward? Do you want to go forward? Do you think we should go forward? Yes? Then I think it’s time to go forward.” All of those messages have to happen.

And even with a two-phase commit, there are some small windows that can leave you in ambiguous states. A human being eventually has to interrupt and say, “You know, that thing did go away and it’s never coming back. So don’t wait.” Say you have three participants in a transaction. Two of them agree to go forward and are waiting to be told to go. But the third one crashes at an inopportune time before it has a chance to vote, so the first two are stuck. There is a small window there. I think it has been proven that it doesn’t matter how many phases you add, you can’t make that window go away. You can only narrow it slightly.

So the transactions approach isn’t perfect, although those kinds of problems happen rarely enough. Maybe instead of ten to the thirteenth, the probability is ten to the thirtieth. Maybe you can ignore it, I don’t know. But that window is certainly a worry.

The main point about transactions is that it has overhead. You have to create the transaction and you have to abort it. One of the things that a container like J2EE does for you is that it hides a lot of that from you. Most things just know that there’s a transaction around them. If somebody thinks it should be aborted, it will be aborted. But most things don’t have to participate very directly in aborting the transaction. That makes it simpler.

Idempotency

I tend to prefer something called idempotency. You can’t solve all problems with it, but basically idempotency says that reinvoking the method will be harmless. It will give you an equivalent result as having invoked it once.

If I want to manipulate a bank account, I send in an operation ID: “This is operation number 75. Please deduct $100 from this bank account.” If I get a failure, I just keep sending it until it gets through. If you’re the recipient, you say, “Oh, 75. I’ve seen that one, so I’ll ignore it.” It is a simple way to deal with partial failure. Basically, recovery is simple retry. Or, potentially, you give up by sending a cancel operation for the ID until that gets through. If you want to do that, though, you’re more likely to use transactions so you can abort them.

Generally, with idempotency, everybody needs to know how to go forward. But people don’t often need to know how to go back. I don’t abort a transaction. I just repeatedly try again until I succeed. That means I need to know how to say to do this. I don’t have to deal with all sorts of ugly recovery most of the time.

Now, what happens if failure increases on the network? You start sending messages more often. If that is a problem, for a long distance you can solve it by writing a check and buying more hardware. Hardware is much cheaper than programmers. Other ways of dealing with this tend to increase the system’s complexity, requiring more programmers.

Bill Venners

Do you mean transactions?

Ken Arnold

Transactions on everything can increase complexity. I’m just talking about transactions and idempotency now, but other recovery mechanisms exist.

If I just have to try everything twice, if I can simply reject the second request if something has already been done, I can just buy another computer and a better network—up to some limit, obviously. At some point, that’s no longer true. But a bigger computer is more reliable and cheaper than another programmer. I tend to like simple solutions and scaling problems that can be solved with checkbooks, even though I am a programmer myself.

Wide-Area Distributed Systems

Bill Venners

Is there anything in particular about Internet- wide distributed systems or large wide area networks that is different from smaller ones? Dealing with increased latency, for example?

Ken Arnold

Yes, latency has a lot to do with it. When you design anything, local or remote, efficiency one of the things you think about. Latency is an important issue. Do you make many little calls or one big call? One of the great things about Jini is that, if you can use objects, you can present an API whose natural model underneath deals with latency by batching up requests where it can. It adapts to the latency that it is in. So you can get away from some of it, but latency is a big issue.

Another issue is of course security. Inside a corporate firewall you say, “We’ll do something straightforward, and if somebody is mucking around with it, we’ll take them to court.” But that is clearly not possible on the Internet; it is a more hostile environment. So you either have to make things not care, which is fine when you don’t care if somebody corrupts your data. Or, you better make it so they can’t corrupt your data. So aside from latency, security is the other piece to think about in widely distributed systems.

State is Hell

Bill Venners

What about state?

Ken Arnold

State is hell. You need to design systems under the assumption that state is hell. Everything that can be stateless should be stateless.

Bill Venners

Define what you mean by that.

Ken Arnold

In this sense, state is essentially something held in one place on behalf of somebody who is in another place, something that is not reconstructible by the other entity that wants it. If I can reconstruct it, it’s called a cache. And caches are often OK. Caching strategies are their own branch of computer science, and you can screw them up. But as a general rule, I send you a bunch of information and you send me the shorthand for it. I start to interact with you using this shorthand. I pass the integer back to you to refer to the cached data: “I’m talking about interaction number 17.” And then you go down. I can send that same state to some equivalent service to you, and build myself back up. That kind of state is essentially a cache.

Caching can get complex. It’s the question that Jini solves with leasing. If one of us goes down, when can the person holding the cache throw this stuff away? Leasing is a good answer to that. There are other answers, but leasing is a pretty elegant one.

On the other hand, if you store information that I can’t reconstruct, then a whole host of questions suddenly surface. One question is, “Are you now a single point of failure?” I have to talk to you now. I can’t talk to anyone else. So what happens if you go down?

To deal with that, you could be replicated. But now you have to worry about replication strategies. What if I talk to one replicant and modify some data, then I talk to another? Is that modification guaranteed to have already arrived there? What is the replication strategy? What kind of consistency do you need—tight or loose? What happens if the network gets partitioned and the replicants can’t talk to each other? Can anybody proceed?

There are answers to these questions. A whole branch of computer science is devoted to replication. But it is a nontrivial issue. You can almost always get into some state where you can’t proceed. You no longer have a single point of failure. You have reduced the probability of not being able to proceed, but you haven’t eliminated it.

If my interaction with you is stateless in the sense I’ve described—nothing more than a cache—then your failure can only slow me down if there’s an equivalent service to you. And that equivalent service can come up after you go down. It doesn’t necessarily have to be there in advance for failover.

So, generally, state introduces a whole host of problems and complications. People have solved them, but they are hell. So I follow the rule: make everything you can stateless. If there is one piece of the system you can’t make stateless—it has to have state—to the extent possible make it hold all the state. Have as few stateful components as you can.

If you end up with a system of five components and two of them have state, ask yourself if only one can have state. Because, assuming all components are critical, if either of the two stateful components goes down you are screwed. So you might as well have just one stateful component. Then at least four, instead of three, components have this wonderful feature of robustability. There are limits to how hard you should try to do that. You probably don’t want to put two completely unrelated things together just for that reason. You may want to implement them in the same place. But in terms of defining the interfaces and designing the systems, you should avoid state and then make as many components as possible stateless. The best answer is to make all components stateless. It is not always achievable, but it’s always my goal.

Bill Venners

All these databases lying around are state. On the Web, every Website has a database behind it.

Ken Arnold

Sure, the file system underneath a Website, even if it’s just HTML, is a database. And that is one case where state is necessary. How am I going to place an order with you if I don’t trust you to hold onto it? So in that case, you have to live with state hell. But a lot of work goes into that dealing with that state. If you look at the reliable, high-performance sites, that is a very nontrivial problem to solve. It is probably the distributed state problem that people are most familiar with. Anyone who has dealt with any large scale, high availability, or high performance piece of the problem knows that state is hell because they’ve lived with that hell. So the question is, “Why have more hell than you need to have?” You have to try and avoid it. Avoid. Avoid. Avoid.

原文地址：
http://www.artima.com/intv/distrib.html.

liunx系统监控-vmstat

发表于 2014-02-23 | 分类于 Linux ，工具，转载

vmstat是一个很全面的性能分析工具，可以观察到系统的进程状态、内存使用、虚拟内存使用、磁盘的 IO、中断、上下问切换、CPU使用等。系统性能分析工具中，我使用最多的是这个，除了 sysstat 工具包外，这个工具能查看的系统资源最多。对于 Linux 的性能分析，100%理解 vmstat 输出内容的含义，那你对系统性能分析的能力就算是基本掌握了。我这里主要说明一下这个命令显示出的部分数据代表的含义，和它反映出系统相关资源的状况。输出内容共有 6 类，分别说明如下。

典型vmstat输出

procs

r运行的和等待(CPU时间片)运行的进程数，这个值也可以判断是否需要增加CPU(长期大于1)
b处于不可中断状态的进程数，常见的情况是由IO引起的

Memory

swpd: 切换到交换内存上的内存(默认以KB为单位) 如果 swpd 的值不为0，或者还比较大，比如超过100M了，但是 si, so 的值长期为 0，这种情况我们可以不用担心，不会影响系统性能。
free: 空闲的物理内存
buff: 作为buffer cache的内存，对块设备的读写进行缓冲
cache: 作为page cache的内存, 文件系统的cache 如果 cache 的值大的时候，说明cache住的文件数多，如果频繁访问到的文件都能被cache住，那么磁盘的读IO-bi会非常小。

Swap

si: 交换内存使用，由磁盘调入内存
so: 交换内存使用，由内存调入磁盘

内存够用的时候，这2个值都是0，如果这2个值长期大于0时，系统性能会受到影响。磁盘IO和CPU资源都会被消耗。我发现有些朋友看到空闲内存(free)很少或接近于0 时，就认为内存不够用了，实际上不能光看这一点的，还要结合si,so，如果free很少，但是si,so也很少(大多时候是0)，那么不用担心，系统性能这时不会受到影响的。

Io

bi: 从块设备读入的数据总量(读磁盘) (KB/s)，
bo: 写入到块设备的数据总理(写磁盘)(KB/s)

随机磁盘读写的时候，这2个值越大（如超出1M），能看到CPU在IO等待的值也会越大

System

in: 每秒产生的中断次数
cs: 每秒产生的上下文切换次数上面这2个值越大，会看到由内核消耗的CPU时间会越多

Cpu

us: 用户进程消耗的CPU时间百分比 us的值比较高时，说明用户进程消耗的CPU时间多，但是如果长期超过50% 的使用，那么我们就该考虑优化程序算法或者进行加速了(比如 PHP/Perl)
sy: 内核进程消耗的CPU时间百分比 sy 的值高时，说明系统内核消耗的CPU资源多，这并不是良性的表现，我们应该检查原因。
wa: IO等待消耗的CPU时间百分比 wa 的值高时，说明IO等待比较严重，这可能是由于磁盘大量作随机访问造成，也有可能是磁盘的带宽出现瓶颈(块操作)。
id: CPU处在空闲状态时间百分比

情景分析

这个vmstat的输出那些信息值得关注

Procs r: 运行的进程比较多，系统很繁忙
Io bo: 磁盘写的数据量稍大，如果是大文件的写，10M以内基本不用担心，如果是小文件写2M以内基本正常
Cpu us: 持续大于50，服务高峰期可以接受
Cpu wa: 稍微有些高
Cpu id:持续小于50，服务高峰期可以接受

转载并整理自：http://fafeng.blogbus.com/logs/6541705.html

利用分布式缓存实现分布式锁

发表于 2013-11-10 | 分类于 architecture

背景

之前在系统中有做一个在客户端重试时，进行去重的逻辑，大致思路是将客户端的请求缓存到分布式缓存中，每次请求到达时先在缓存中查询，如果有就直接返回，没有就帮用户重试该请求。一天晚上(不凑巧，刚好是结婚当天)，一个微信会话来了，某某老大的评论重复……

分析

上面在写数据前，会先请求一次缓存，但是这个操作在分布式环境下显然会有如下场景出现：

①　节点A，Get数据，返回不存在

②　节点B，Get数据，返回不存在

③　节点A写数据

④　节点A返回

⑤　节点B写数据

⑥　节点B返回

根本原因就是Get数据和Set数据不是原子操作，导致出现了A，B都认为数据不存在。

问题很明显，就是没有对数据操作进行同步，如果在单机环境下，这个可以很简单的通过操作系统提供的各种同步原语处理，但是现在节点A，B是处在不同机器上，这就涉及到分布式锁的问题了。类似于单机环境下的线程同步原语，我们只需要一种机制让应用程序知道某个资源被占用了（例如mutex如果lock失败，操作系统即会将该进程挂起），在分布式缓存中一般都存在一个叫做add的操作，该操作保证只有在资源不存在时才能执行成功，否则会告知调用者失败，且标注为特殊的错误码。

实现

这里只简单给出了获取锁一般实现伪代码，不同的业务场景有不同的处理，比如失败后继续重试直到成功。

在memcache中：

if(cache->add(key, value, expire))
{
    //get lock successful
}
else
{
    //get lock fail
}

在Redis中可以通过SET命令(2.6.12以上版本)或者SETNX命令，类似memcache中的ADD

SET key value NX EX max_lock_time。在redis-py中有一个封装好的lock实现可以直接使用。

总结

像memcache，redis这里系统一般都是作为缓存来使用，但是在某些时候通过深入挖掘其实也可以有一些意想不到的作用，通过一个简单的语句既可以实现一个基本上够用的分布式锁，其性价比不言而喻。在网上随便一搜，也有好多类似的同行遇到这个问题，下面是几个链接，有国产的也有国外：

http://abhinavsingh.com/blog/2009/12/how-to-use-locks-for-assuring-atomic-operation-in-memcached/

http://chuyinfeng.com/p/39

http://blog.webfuns.net/archives/1722.html

http://jiangbo.me/blog/2011/02/27/post/

Thrift多路复用的设计与实现

发表于 2013-10-20 | 分类于 C++ ，设计模式

# 介绍

Thrift作为一个跨语言的rpc框架，为后端服务间的多语言混合开发提供了高可靠，可扩展，以及高效的实现。但是自从2007年facebook开源后的6年时间内一直缺少一个多路复用的功能(Multiplexing Services)，也就是在一个Server上面实现多个Service调用。比如要实现一个后端服务，很多时候不仅仅只是实现业务服务接口，往往还会预留一些调试，监控服务接口，以往要实现多个Service在一个Server上面调用，要么将多个服务糅合成一个庞大的服务（图1），要么将多个Service分摊到不同的Server上（图2）。最近的版本0.9.1终于实现内置的多路复用，本文将就该功能简要分析一下该功能的设计与实现，并提供一个简单的示例。

图1

图2

## 设计

Thrift的架构如下：

Thrift采取了很优雅的分层设计，下面简述各层的主要功能：

Ø Transport

负责传输数据的接口，有文件，内存，套接字等实现。

Ø Protocol

负责数据的协议交互接口，有二进制，Json等实现。

Ø Processor

负责输入输出数据处理的接口，其实就是对Protocol的处理。

Ø Server

包括了上面各层的创建和管理，并提供网络服务的功能，网络服务这块目前有四个实现，分别是最简单的单线程阻塞，多线程阻塞，线程池阻塞和基于libevent的非阻塞模式。

实现多路复用有以下几个需要注意的地方：

Ø 不修改现有的Thrift代码，主要是指向后兼容，旧代码可以在新版本中编译

Ø 不修改白皮书上说的协议

Ø 不依赖任何Server层代码，也不需要新的Server实现

0.9.1之前版本不能多个Service在同一个Server上面调用，主要是因为在协议层只把Service的函数名打包，没有将ServiceName打包进去，所以在Processor层默认只能处理一个Service。

新版本中通过新增以下设计完成了Service的多路复用：

深色部分为新增部分，需要多个Service复用Server的客户端需要使用新的Protocol实现，支持Service复用的服务端需要通过新的Processor实现注册不同的Service。

## 实现

从release-notes看到目前已经有Java，Delphi，C#，C++几个语言实现了该功能，具体到C++实现按照THRIFT-1902的介绍，是直接移植的java版本。

### 新增代码

protocol/TMultiplexedProtocol.h

protocol/TMultiplexedProtocol.cpp

protocol/TProtocolDecorator.h

processor/TMultiplexedProcessor.h

### 主要逻辑

1) TMultiplexedProtocol类重写writeMessageBegin_virt方法，在Service的函数名前新增了ServiceName，并附带了一个分隔符；该类采取了decorator模式将绝大部分操作转发到实际的protocol对象。

2) TMultiplexedProcessor类增加了一个存放service的map，

typedef std::map< std::string, shared_ptr<TProcessor> > services_t;

key为ServiceName，这样在process方法中先解出TMultiplexedProtocol传递过来的ServiceName，通过该key查找到对应的Processor对象，再调用该Processor的process方法，这样就完美的实现了多个serivice的共存。

## 使用示例

IDL:

namespace cpp thrift.multiplex.demo

service FirstService
{
    void blahBlah()
}

service SecondService
{
    void blahBlah()
}

Server:

int port = 9090;
shared_ptr<TProcessor> processor1(new FirstServiceProcessor
         (shared_ptr<FirstServiceHandler>(new FirstServiceHandler())));
shared_ptr<TProcessor> processor2(new SecondServiceProcessor
         (shared_ptr<SecondServiceHandler>(new SecondServiceHandler())));

//使用MultiplexedProcessor
shared_ptr<TMultiplexedProcessor> processor(new TMultiplexedProcessor());

//注册各自的Service
processor->registerProcessor(“FirstService”, processor1);
processor->registerProcessor(“SecondService”, processor2);

shared_ptr<TServerTransport> serverTransport(new TServerSocket(port));
shared_ptr<TTransportFactory> transportFactory(new TBufferedTransportFactory());
shared_ptr<TProtocolFactory> protocolFactory(new TBinaryProtocolFactory());
TSimpleServer server(processor, serverTransport, transportFactory, protocolFactory);
server.serve();

Client:

shared_ptr<TSocket> transport(new TSocket(“localhost”, 9090));
transport->open();

shared_ptr<TBinaryProtocol> protocol(new TBinaryProtocol(transport));
shared_ptr<TMultiplexedProtocol> mp1(new TMultiplexedProtocol(protocol, “FirstService”));
shared_ptr<FirstServiceClient> service1(new FirstServiceClient(mp1));
shared_ptr<TMultiplexedProtocol> mp2(new TMultiplexedProtocol(protocol, “SecondService”));
shared_ptr<SecondServiceClient> service2(new SecondServiceClient(mp2));

service1->blahBlah();
service2->blahBlah();

## 总结

Thrift通过在客户端采取的decorator模式巧妙的在协议层将ServiceName传递到服务端，服务端通过常见的手段将ServiceName和具体的Processor进行映射，完美的解决了Service的多路复用问题。

## 参考资料

1. http://thrift.apache.org/docs/concepts/

2. https://issues.apache.org/jira/browse/THRIFT-1902

3. https://issues.apache.org/jira/browse/THRIFT-563

4. http://bigdata.impetus.com/whitepaper

qzone说说架构笔记

发表于 2013-09-09 | 分类于未分类

今天听了QZone说说部分的架构分享，是这个系列分享迄今为止干货最多的一次，分享者讲得比较快，但是都很实在，个人觉得主要有以下几点值得学习：

对CAP的应用，说说选择了AP，对于C是最终一致，大多数互联网系统应该都是这样
削峰填谷，主要是每年除夕，元旦等高峰期请求量暴涨，这个时候如果只是盲目的进行扩容是不太合适的，通过类似队列方式进行异步处理是比较好的削峰手段(新浪微博计数服务采用这个)，由于说说属于UGC性质，用户对实时性和准确性要求较高，采用的是一个单独的cache服务，且只会在判断为高峰期才会自动打开
高峰期时的柔性可用，有选择的舍弃一些不太重要的服务，比如同步到个性签名，微博等
数据存储结构（按uin存，每个uin最多保留3条记录），这个方案跟最近我遇到的一个问题类似，不过我最后还是采取的按照session来存数据

通过开源软件赚钱的几个途径

发表于 2013-09-01 | 分类于未分类

近两年开始完全转向后台相关开发后，接触了不少开源软件，学习和使用优秀的开源软件不仅是对自己技术方面的提升，更能直接的获得不少物质上的收益，下面是我目前总结的两个途径：

写书

展开来讲，跟开源软件相关的书籍主要有两类：

源码剖析，比较典型的有lighttpd源码剖析，hadoop源码分析；
入门指导之类，比如mongdb权威指南，hadoop权威指南
做增值服务

目前知道的有以下几类：

cloudera之于hadoop
Garantia Data对memcached和redis提供的集群服务
ngxin成立了一个单独的商业公司提供更强的功能
为mongodb提高监控功能的serverdensity和mongodirector
貌似源码分析之类的书基本上都是国内，权威指南之类的大多先是国外，增值服务的基本上都是国外的

vmware虚拟机copy或move

发表于 2013-07-29 | 分类于未分类

今天把用了三年多的xp换成了win7，有很多软件是不需要重装的，但是vmware必须得重装，接踵而至的就是vmware虚拟的IP段和网卡地址变化，导致打开原有的虚拟机时会提示类似“I copy it or move it”。

这个应该是因为vmware网卡变化导致的，网上有找到这个copy和move的区别：

选择I copied it的时候，VMware软件检测到物理机改变后会对个虚拟机重新生成新的网卡MAC地址，UUID
选择I moved it，只改变UUID，虚拟机其它配置不变
我一般都习惯性的选择move，看来这个习惯还是正确的，呵呵，但是虚拟机IP还是得修改才能方便使用，例如重装前装的虚拟机IP大致如下：

suse_ifconfig_old
而重装后vmware的IP地址发生了变化：
vmware8_ipconfig 这就导致打开的虚拟机IP段和vmware的不匹配，从而无法与vmware通信，进而与本机实体机通信，最简单的临时方案时先在vmware中登陆，修改虚拟机的IP：
modify_ip

后面就可以通过secretcrt之类工具在windows使用了，但是每次都这样改也不是长久之计，终极方案是彻底修改掉其IP，在suse下面是如下文件：

/etc/sysconfig/network/ifcfg-eth-id-00:0c:29:c5:67:f8(后面这串数字其实就是虚拟机的mac)

修改后直接重启即可。

如果选择copy，好像是会改变网络接口名，也就是这个eth0,eth1，在/etc/udev/rules.d/30-net_persistent_names.rules文件中有如下记录：

suse_udev_net

gcc升级

发表于 2013-07-06 | 分类于 Linux ，工具

http://gcc.gnu.org/install/这个是官方安装指导，比较全面，在网上看到一些方法大都不用这么复杂。下面记录一下前两天在我的CentOS 6.2上面的升级记录：

1下载安装包，包括gmp-4.3.2.tar.bz2，mpfr-2.4.2.tar.bz2，mpc-0.8.1.tar.gz，gcc-4.7.2.tar.bz2

2由于安装包之间有依赖关系，必须依次安装，步骤为

tar jxf gmp-4.3.2.tar.bz2
cd gmp-4.3.2
./configure –prefix=/usr/local/gmp
make && make install
cd ../
tar jxf mpfr-2.4.2.tar.bz2
cd mpfr-2.4.2
./configure –prefix=/usr/local/mpfr –with-gmp=/usr/local/gmp
make && make install
cd ../
tar xzf mpc-0.8.1.tar.gz
cd mpc-0.8.1
./configure –prefix=/usr/local/mpc –with-mpfr=/usr/local/mpfr –with-gmp=/usr/local/gmp
make && make install
cd ../
tar jxf gcc-4.7.2.tar.bz2
cd gcc-4.7.2
./configure –prefix=/usr/local/gcc –enable-threads=posix –disable-checking –disable-multilib –enable-languages=c,c++ –with-gmp=/usr/local/gmp –with-mpfr=/usr/local/mpfr/ –with-mpc=/usr/local/mpc/
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/mpc/lib:/usr/local/gmp/lib:/usr/local/mpfr/lib/
make && make install
在我的虚拟机上面2个小时才搞定，一切顺利的话剩下就是设置新的gcc，主要有两种方式：

通过aliasalias g++=’/usr/local/gcc/bin/g++’
alias gcc=’/usr/local/gcc/bin/gcc’
通过ln建立软链接
mv /usr/bin{gcc,g++} /home/backup（记得一定要备份一下）

ln -s /usr/local/gcc/bin/gcc /usr/bin/gcc

ln -s /usr/local/gcc/bin/g++ /usr/bin/g++

另外如果可以上网的话可以按照这篇文字的介绍安装（没有实测过，应该是可以的）

http://www.cnblogs.com/linbc/archive/2012/08/03/2621169.html

2013.10.17更新：

遇到/usr/lib/libstdc++.so.6: version `GLIBCXX_3.4.15’ not found问题，主要是没有将相应的库更新，在gcc源码目录里面找到最新的库，gcc-4.7.2里面最新的是libstdc++.so.6.0.17，将其拷贝到/usr/lib/目录下面，删除原有链接，新建链接即可：

rm -rf `/usr/lib/libstdc++.so.6`

ln -s `/usr/lib/libstdc++.so.6.0.17/usr/lib64/libstdc++.so.6`

2013.10.20更新：

今天在使用scons时发现提示”/usr/local/gcc/libexec/gcc/i686-pc-linux-gnu/4.7.2/cc1plus: error while loading shared libraries: libmpc.so.2: cannot open shared object file: No such file or directory”

主要是由于在scons中定义Environment()时没加ENV = os.environ，导致在~/.bash_profile中导出的路径无效了，加上这个就可以了。

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/mpc/lib:/usr/local/gmp/lib:/usr/local/mpfr/lib/