General Design Issues of the Internet

I just came across the Internet design principles in RFC 1958 and found them very instructive for system design today, so I am excerpting them here:

3.1 Heterogeneity is inevitable and must be supported by design.
Multiple types of hardware must be allowed for, e.g. transmission
speeds differing by at least 7 orders of magnitude, various computer
word lengths, and hosts ranging from memory-starved microprocessors
up to massively parallel supercomputers. Multiple types of
application protocol must be allowed for, ranging from the simplest
such as remote login up to the most complex such as distributed
databases.

3.2 If there are several ways of doing the same thing, choose one.
If a previous design, in the Internet context or elsewhere, has
successfully solved the same problem, choose the same solution unless
there is a good technical reason not to. Duplication of the same
protocol functionality should be avoided as far as possible, without
of course using this argument to reject improvements.

3.3 All designs must scale readily to very many nodes per site and to
many millions of sites.

3.4 Performance and cost must be considered as well as functionality.

3.5 Keep it simple. When in doubt during design, choose the simplest
solution.

3.6 Modularity is good. If you can keep things separate, do so.

3.7 In many cases it is better to adopt an almost complete solution
now, rather than to wait until a perfect solution can be found.

3.8 Avoid options and parameters whenever possible. Any options and
parameters should be configured or negotiated dynamically rather than
manually.

3.9 Be strict when sending and tolerant when receiving.
Implementations must follow specifications precisely when sending to
the network, and tolerate faulty input from the network. When in
doubt, discard faulty input silently, without returning an error
message unless this is required by the specification.

3.10 Be parsimonious with unsolicited packets, especially multicasts
and broadcasts.

3.11 Circular dependencies must be avoided.

For example, routing must not depend on look-ups in the Domain
Name System (DNS), since the updating of DNS servers depends on
successful routing.

3.12 Objects should be self describing (include type and size), within
reasonable limits. Only type codes and other magic numbers assigned
by the Internet Assigned Numbers Authority (IANA) may be used.

3.13 All specifications should use the same terminology and notation,
and the same bit- and byte-order convention.

3.14 And perhaps most important: Nothing gets standardised until
there are multiple instances of running code.
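
Principles 3.9 and 3.12 fit together naturally, so here is a minimal sketch of both: a type-length-value record that describes itself, a sender that refuses to emit anything outside the format, and a receiver that quietly drops what it cannot use. The record layout and the type codes are invented for this illustration and are not taken from the RFC or from any real protocol.

```python
import struct

# Hypothetical type codes for illustration only; a real protocol would use
# values assigned by IANA.
TYPE_TEXT = 1
TYPE_UINT32 = 2
KNOWN_TYPES = {TYPE_TEXT, TYPE_UINT32}


def encode(type_code, value):
    """Strict sender: refuse to emit anything outside the (made-up) format."""
    if type_code not in KNOWN_TYPES:
        raise ValueError("unknown type code %d" % type_code)
    if len(value) > 0xFFFF:
        raise ValueError("value too long for a 16-bit length field")
    # 1-byte type, 2-byte length in network byte order, then the value.
    return struct.pack("!BH", type_code, len(value)) + value


def decode_all(buf):
    """Tolerant receiver: skip unknown types, stop quietly on truncation."""
    records = []
    offset = 0
    while offset + 3 <= len(buf):
        type_code, length = struct.unpack_from("!BH", buf, offset)
        offset += 3
        if offset + length > len(buf):
            break  # truncated record: discard silently, no error reply
        value = buf[offset:offset + length]
        offset += length
        if type_code in KNOWN_TYPES:
            records.append((type_code, value))
        # Unknown types are skipped; the length field says how far to jump.
    return records


unknown_record = struct.pack("!BH", 99, 2) + b"??"
wire = (encode(TYPE_TEXT, b"hello")
        + unknown_record
        + encode(TYPE_UINT32, struct.pack("!I", 7)))
print(decode_all(wire))
```

Because every record carries its own type and size, the receiver can step over the unknown record without understanding it, which is what makes the tolerant half of 3.9 cheap to implement.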

[openstack] VM cannot reach the external network because the NAT gateway and the port IP differ

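After assigning a floating IP to a VM, the VM still could not reach the external network. Tracking it down, the cause turned out to be the way the network was set up in quantum:

In quantum the external network is set to 192.168.19.129/25 with no gateway specified, and allocation_pools is {"start": "192.168.19.130", "end": "192.168.19.254"}.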

root@controller:/usr/src/nova# ip netns exec qrouter-b4721d20-9d39-4d4d-9c37-f18ecb460d02 route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 192.168.19.129 0.0.0.0 UG 0 0 0 qg-29c30020-2e
10.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 qr-cd728374-d8
10.0.1.0 0.0.0.0 255.255.255.0 U 0 0 0 qr-f915c799-96
192.168.19.128 0.0.0.0 255.255.255.128 U 0 0 0 qg-29c30020-2e

The router's interfaces, however, look like this:

root@controller:/usr/src/nova# ip netns exec qrouter-b4721d20-9d39-4d4d-9c37-f18ecb460d02 ifconfig
lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:14 errors:0 dropped:0 overruns:0 frame:0
          TX packets:14 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:1390 (1.3 KB)  TX bytes:1390 (1.3 KB)

qg-29c30020-2e Link encap:Ethernet  HWaddr fa:16:3e:10:18:21  
          inet addr:192.168.19.130  Bcast:192.168.19.255  Mask:255.255.255.128
          inet6 addr: fe80::f816:3eff:fe10:1821/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:87 errors:0 dropped:0 overruns:0 frame:0
          TX packets:67 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:12593 (12.5 KB)  TX bytes:9608 (9.6 KB)

qr-cd728374-d8 Link encap:Ethernet  HWaddr fa:16:3e:d7:5a:2f  
          inet addr:10.0.0.1  Bcast:10.0.0.255  Mask:255.255.255.0
          inet6 addr: fe80::f816:3eff:fed7:5a2f/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:64 errors:0 dropped:0 overruns:0 frame:0
          TX packets:89 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:9710 (9.7 KB)  TX bytes:10627 (10.6 KB)

qr-f915c799-96 Link encap:Ethernet  HWaddr fa:16:3e:96:89:3a  
          inet addr:10.0.1.1  Bcast:10.0.1.255  Mask:255.255.255.0
          inet6 addr: fe80::f816:3eff:fe96:893a/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:9 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 B)  TX bytes:594 (594.0 B)

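These two values differ: packets that should be routed via 192.168.19.130 are all sent to 192.168.19.129, so the VM cannot get out. The address 192.168.19.130 is in fact the fixed_ips value of the quantum port that connects to the external network: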

+--------------------------------------+------+-------------------+---------------------------------------------------------------------------------------+
| id                                   | name | mac_address       | fixed_ips                                                                             |
+--------------------------------------+------+-------------------+---------------------------------------------------------------------------------------+
| 10d13e25-cc01-4edc-aba4-5e2b3a6dff80 |      | fa:16:3e:e6:9e:30 | {"subnet_id": "169ad3b8-c961-4128-b053-2d6d36afbe1f", "ip_address": "10.0.0.4"}       |
| 29c30020-2e91-4ffa-91e3-a8acef553641 |      | fa:16:3e:10:18:21 | {"subnet_id": "3f53264f-683b-45a8-a7ab-289afd2288b5", "ip_address": "192.168.19.130"} |
| 7e659611-43b3-4f52-b392-28ddd5051bca |      | fa:16:3e:9e:84:c8 | {"subnet_id": "3f53264f-683b-45a8-a7ab-289afd2288b5", "ip_address": "192.168.19.131"} |
| 7f000789-2e36-4aef-8d08-acb700ddde9f |      | fa:16:3e:07:92:81 | {"subnet_id": "169ad3b8-c961-4128-b053-2d6d36afbe1f", "ip_address": "10.0.0.2"}       |
| 91da98b9-e9df-4a2c-b97d-02299d33fe89 |      | fa:16:3e:f7:42:d9 | {"subnet_id": "3f53264f-683b-45a8-a7ab-289afd2288b5", "ip_address": "192.168.19.132"} |
| a132b58c-238a-4b9f-92ce-c47521cda668 |      | fa:16:3e:31:81:8e | {"subnet_id": "169ad3b8-c961-4128-b053-2d6d36afbe1f", "ip_address": "10.0.0.3"}       |
| b1a9afa6-6850-4044-a2b6-cca6c12fc6fa |      | fa:16:3e:89:2e:fb | {"subnet_id": "0636c5f2-70ab-4fb9-a7d5-986c92eaf1aa", "ip_address": "10.0.1.2"}       |
| b629349e-ad6e-427a-8aae-291f55ef4b32 |      | fa:16:3e:31:a2:cf | {"subnet_id": "169ad3b8-c961-4128-b053-2d6d36afbe1f", "ip_address": "10.0.0.5"}       |
| cd728374-d89e-4f64-b437-b3e1580b49e9 |      | fa:16:3e:d7:5a:2f | {"subnet_id": "169ad3b8-c961-4128-b053-2d6d36afbe1f", "ip_address": "10.0.0.1"}       |
| f915c799-96aa-40bf-a3aa-06d43bc1c284 |      | fa:16:3e:96:89:3a | {"subnet_id": "0636c5f2-70ab-4fb9-a7d5-986c92eaf1aa", "ip_address": "10.0.1.1"}       |
+--------------------------------------+------+-------------------+---------------------------------------------------------------------------------------+
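
The comparison above can be sketched in a few lines: pull the default gateway out of the `route -n` text and compare it with the fixed IP of the external port. The routing table is pasted from the output earlier in this post; the helper name `default_gateway` is made up for the sketch.

```python
import ipaddress

# `route -n` output from the qrouter namespace, as shown earlier in this post.
ROUTE_OUTPUT = """\
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 192.168.19.129 0.0.0.0 UG 0 0 0 qg-29c30020-2e
10.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 qr-cd728374-d8
10.0.1.0 0.0.0.0 255.255.255.0 U 0 0 0 qr-f915c799-96
192.168.19.128 0.0.0.0 255.255.255.128 U 0 0 0 qg-29c30020-2e
"""


def default_gateway(route_output):
    """Return (gateway, iface) of the default route, or None if there is none."""
    for line in route_output.splitlines():
        fields = line.split()
        # A default route has destination 0.0.0.0 and the G (gateway) flag.
        if len(fields) == 8 and fields[0] == "0.0.0.0" and "G" in fields[3]:
            return ipaddress.ip_address(fields[1]), fields[7]
    return None


gateway, iface = default_gateway(ROUTE_OUTPUT)
# fixed_ips address of port 29c30020-..., taken from the port list.
port_ip = ipaddress.ip_address("192.168.19.130")
print(gateway, iface, gateway == port_ip)
```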

Setting the gateway of this network to .130 fails:

# quantum subnet-update userA-public --gateway_ip 192.168.19.130
Gateway ip 192.168.19.130 conflicts with allocation pool 192.168.19.130-192.168.19.254
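
The error is easy to reproduce on paper. The sketch below is not quantum's actual validation code, only the same range check written with Python's `ipaddress` module. It also prints the first host address of the /25, which matches the 192.168.19.129 seen in the routing table; that quantum falls back to the first host address when no gateway is given is my assumption here.

```python
import ipaddress


def gateway_conflicts_with_pool(gateway, pool_start, pool_end):
    """True if the gateway address lies inside [pool_start, pool_end]."""
    gw = ipaddress.ip_address(gateway)
    return ipaddress.ip_address(pool_start) <= gw <= ipaddress.ip_address(pool_end)


subnet = ipaddress.ip_network("192.168.19.128/25")
first_host = next(subnet.hosts())

print(first_host)
# .129 sits just below the pool, so it is accepted as a gateway.
print(gateway_conflicts_with_pool("192.168.19.129", "192.168.19.130", "192.168.19.254"))
# .130 is the first address of the pool, hence the conflict.
print(gateway_conflicts_with_pool("192.168.19.130", "192.168.19.130", "192.168.19.254"))
```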

In the quantum code this shows up in:
agent/l3_agent.py

        ex_gw_ip = ex_gw_port['fixed_ips'][0]['ip_address']
        if not ip_lib.device_exists(interface_name,
                                    root_helper=self.root_helper,
                                    namespace=ri.ns_name()):
            
......

        gw_ip = ex_gw_port['subnet']['gateway_ip']
        if ex_gw_port['subnet']['gateway_ip']:
            cmd = ['route', 'add', 'default', 'gw', gw_ip]

I do not know why there are two values here, ex_gw_ip and gw_ip; the mismatch between them is what causes the problem.

The workaround is simple:

# ip netns exec qrouter-b4721d20-9d39-4d4d-9c37-f18ecb460d02 route del default gw 192.168.19.129
# ip netns exec qrouter-b4721d20-9d39-4d4d-9c37-f18ecb460d02 route add default gw 192.168.19.130

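---------- divider ----------
The workaround above is a nuisance, because it has to be reapplied every time the l3 agent restarts. I looked at the problem again today, and it really comes from not understanding neutron networking well enough. My earlier approach pushed packets out to the qg-f103a9f2-d6 interface first and then left the routing decision to the routing table outside the namespace: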

# ip netns exec qrouter-09be29ea-25f6-4a53-b3ab-8d0e13dc7198 route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.19.130  0.0.0.0         UG    0      0        0 qg-f103a9f2-d6
100.0.0.0       0.0.0.0         255.255.255.0   U     0      0        0 qr-d8fcb028-ea
192.168.19.0    0.0.0.0         255.255.255.0   U     0      0        0 qg-f103a9f2-d6
200.0.0.0       0.0.0.0         255.255.255.0   U     0      0        0 qr-b422c431-d8

# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.19.254  0.0.0.0         UG    100    0        0 br-ex
20.0.0.0        0.0.0.0         255.255.255.0   U     0      0        0 eth1
30.0.0.0        0.0.0.0         255.255.255.0   U     0      0        0 eth2
192.168.19.0    0.0.0.0         255.255.255.0   U     0      0        0 br-ex

The packet path is qr-d8fcb028-ea -> (namespace routing) -> qg-f103a9f2-d6 -> (routing) -> br-ex -> eth0 -> router

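In fact public-net is itself an external network, so it should match the physical host's network: 192.168.19.0/24, with the physical gateway 192.168.19.254 as its gateway. With that setup, the l3 agent creates the default route in the namespace by itself every time: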

# ip netns exec qrouter-09be29ea-25f6-4a53-b3ab-8d0e13dc7198 route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.19.254  0.0.0.0         UG    0      0        0 qg-f103a9f2-d6
100.0.0.0       0.0.0.0         255.255.255.0   U     0      0        0 qr-d8fcb028-ea
192.168.19.0    0.0.0.0         255.255.255.0   U     0      0        0 qg-f103a9f2-d6
200.0.0.0       0.0.0.0         255.255.255.0   U     0      0        0 qr-b422c431-d8

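This way, once a packet reaches the namespace, the routing decision there sends it straight out through qg-f103a9f2-d6 and br-ex to eth0. The packet path is the same as before, but from the namespace to the physical gateway the packet now stays within a single network.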
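
To make the "routing decision" concrete, here is a small sketch of a longest-prefix-match lookup over the namespace routing table shown above. It only mimics what the kernel does for a simple table like this one, and 8.8.8.8 is just an arbitrary outside address picked for the example.

```python
import ipaddress

# Namespace routing table from the `route -n` output above:
# (destination, gateway, interface). None means "directly connected".
ROUTES = [
    ("0.0.0.0/0", "192.168.19.254", "qg-f103a9f2-d6"),
    ("100.0.0.0/24", None, "qr-d8fcb028-ea"),
    ("192.168.19.0/24", None, "qg-f103a9f2-d6"),
    ("200.0.0.0/24", None, "qr-b422c431-d8"),
]


def lookup(dst):
    """Return (next_hop, interface) for dst using longest-prefix match."""
    addr = ipaddress.ip_address(dst)
    matches = [
        (ipaddress.ip_network(net), gw, iface)
        for net, gw, iface in ROUTES
        if addr in ipaddress.ip_network(net)
    ]
    # The default route matches everything, so `matches` is never empty here.
    net, gw, iface = max(matches, key=lambda route: route[0].prefixlen)
    # On a directly connected network the next hop is the destination itself.
    return (gw or dst), iface


print(lookup("8.8.8.8"))         # outside address: goes via the physical gateway
print(lookup("192.168.19.254"))  # the gateway itself is on-link on qg-f103a9f2-d6
print(lookup("100.0.0.5"))       # a VM on the first internal subnet
```

The second lookup is the point of the fix: the physical gateway is directly reachable on the qg interface, so the default route installed by the l3 agent works without any manual step.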

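One reason a VM cannot get online

There are many reasons a VM in Openstack may fail to get online. I ran into one today. I had actually seen it before, but I had forgotten the details and spent half a day debugging it again. What a tragedy...

Symptom: the VM's network is very slow. For example, apt-get update can connect to the server but downloads almost nothing for a long time.

Debugging: since the VM could reach the network at all, I first suspected quantum's l3 agent, but iptables and route inside the namespace both looked fine.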
root@controller:/usr/src/nova# ip netns
qdhcp-eb2fc4cd-d656-4e64-adc2-001d3cfbcebd
qrouter-b4721d20-9d39-4d4d-9c37-f18ecb460d02
qdhcp-77a8d872-103a-4d8c-9f47-bc6ec34a2ff4
qdhcp-faacf658-dae9-4230-8fbc-7cde47c425b1

Then I captured packets:
ip netns exec qrouter-b4721d20-9d39-4d4d-9c37-f18ecb460d02 tcpdump -i qg-29c30020-2e (external interface)
15:20:00.465882 IP 192.168.19.131 > likho.canonical.com: ICMP 192.168.19.131 unreachable - need to frag (mtu 1454), length 556
15:20:02.046704 IP likho.canonical.com.http > 192.168.19.131.56147: Flags [.], seq 1:1449, ack 256, win 61, options [nop,nop,TS val 3517237056 ecr 105897], length 1448
15:20:02.046825 IP 192.168.19.131 > likho.canonical.com: ICMP 192.168.19.131 unreachable - need to frag (mtu 1454), length 556
ip netns exec qrouter-b4721d20-9d39-4d4d-9c37-f18ecb460d02 tcpdump -i qr-cd728374-d8 (internal interface)
15:20:02.046763 IP likho.canonical.com.http > 10.0.0.4.56147: Flags [.], seq 1:1449, ack 256, win 61, options [nop,nop,TS val 3517237056 ecr 105897], length 1448
15:20:02.046800 IP 10.0.0.4 > likho.canonical.com: ICMP 10.0.0.4 unreachable - need to frag (mtu 1454), length 556
15:20:02.541472 IP sudice.canonical.com.http > 10.0.0.4.39679: Flags [.], seq 1:1449, ack 258, win 61, options [nop,nop,TS val 1537132828 ecr 105323], length 1448
15:20:02.541502 IP 10.0.0.4 > sudice.canonical.com: ICMP 10.0.0.4 unreachable - need to frag (mtu 1454), length 556
There were many "unreachable" messages. At first I thought the internal traffic was leaving without going through SNAT, and only later did I notice that the key part of the error is "need to frag".

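The cause is that the VM's MTU is too large for this path: the full-size packets coming back exceed the MTU of 1454 reported in the ICMP messages. The fix is simple: set the MTU inside the VM to a value below that limit: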
ifconfig eth0 mtu 1400
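
The arithmetic behind the capture is worth spelling out. The sketch below assumes plain IPv4 and TCP headers of 20 bytes each plus the 12 bytes of TCP options visible in the capture (nop, nop, timestamp); the function names are made up for the example.

```python
IP_HEADER = 20          # IPv4 header without options
TCP_HEADER = 20         # TCP header without options
TCP_TIMESTAMP_OPT = 12  # nop + nop + 10-byte timestamp option, as in the capture

PATH_MTU = 1454         # from the ICMP "need to frag (mtu 1454)" messages


def ip_packet_size(tcp_payload):
    """Total IPv4 packet size for a segment carrying tcp_payload bytes."""
    return IP_HEADER + TCP_HEADER + TCP_TIMESTAMP_OPT + tcp_payload


def max_payload(mtu):
    """Largest TCP payload that fits into one packet of the given MTU."""
    return mtu - IP_HEADER - TCP_HEADER - TCP_TIMESTAMP_OPT


print(ip_packet_size(1448))                           # size of the packets in the capture
print(ip_packet_size(1448) > PATH_MTU)                # too big for the path
print(max_payload(1400))                              # payload per segment after the fix
print(ip_packet_size(max_payload(1400)) <= PATH_MTU)  # now it fits
```

With the interface MTU at 1400 the VM announces a smaller segment size when it opens a connection, so the server stops sending packets that the path cannot carry.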

DONE

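For every trick there is a better counter

Background: the lady from the neighborhood office told me that Party members' study is now tracked with points. You log in to a certain "Pioneer" website and stay until you reach 90 minutes, and time does not count while the mouse is idle. How inhumane!

Being a programmer, I took a close look at the timing logic, which is implemented in JS. So last month I wrote one line of code and solved it with a Chrome JavaScript bookmarklet. See the Weibo post I published earlier.

This month I looked again and found that the code had been rewritten, with a limit on the number of pages read. Which wretched programmer did this? Did they see my Weibo post? Even more infuriating, they added a browser restriction, trampling on ordinary users again and again for their own convenience, which is typical of some Chinese programmers.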


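The rewritten code is still implemented in JS, though, so it was beaten all the same! The result is shown in the screenshot.

[screenshot]

Here is the code: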

javascript:function refreshpage(){ count=10;addtime(); setTimeout('refreshpage()',10000); } refreshpage();

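Ten pages every ten seconds. That still makes me a good Party member, right?

As I said before, why bother with this XX kind of thing? Programmers should not make life hard for other programmers... If anyone disagrees, we can continue the contest next month. By the way, enforcing restrictions with JavaScript is pure hooliganism.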

An SSL problem with ruby

A junior labmate's ruby had a problem: starting rails failed with this error:

$ ruby script/server
=> Booting Mongrel (use 'script/server webrick' to force WEBrick)
=> Rails 2.1.0 application starting on http://0.0.0.0:3000
=> Call with -d to detach
=> Ctrl-C to shutdown server
** Starting Mongrel listening at 0.0.0.0:3000
** Starting Rails with development environment...
Exiting
/usr/local/lib/ruby/gems/1.8/gems/rails-2.1.0/lib/initializer.rb:225:in `require_frameworks': no such file to load -- openssl (RuntimeError)
from /usr/local/lib/ruby/gems/1.8/gems/rails-2.1.0/lib/initializer.rb:113:in `process'
from /usr/local/lib/ruby/gems/1.8/gems/rails-2.1.0/lib/initializer.rb:93:in `send'
from /usr/local/lib/ruby/gems/1.8/gems/rails-2.1.0/lib/initializer.rb:93:in `run'
from /home/xulei/ROR/mybook/config/environment.rb:13
from /usr/local/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in `gem_original_require'
…………………….

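A whole pile of errors.

I tried installing ruby and rails with apt, but ran into dependency problems on ubuntu 8.04 and gave up. So much for my daydream of switching to ubuntu.

Installing ruby and rubygems from source and then installing rails with gem gave the same error, so that failed too.

Googling suggested installing openssl-dev first and then "reconfigure all the ruby packages from scratch". For a long time I did not understand what "reconfigure" meant. It turned out to mean going back to the ruby source directory and running configure, make and make install again; after that everything was ok.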

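ror really is productive

I finished the lab's home page (http://pact518.hit.edu.cn) in three evenings without much effort.

The business logic in particular is quick to build. Of course, ror is only productive if you already know ror; otherwise, when something goes wrong, the time spent hunting for the problem could have gone into building a whole new site, haha.

I am actually still not very familiar with ruby syntax, but it is enough for fairly simple applications :-)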

The id of a firefox3 extension

A typical install.rdf looks like this:

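<RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:em="http://www.mozilla.org/2004/em-rdf#">
  <Description about="urn:mozilla:install-manifest">
    <em:id>marvelliu.@gmail.com</em:id>
    <em:name>SurfLilac</em:name>
    <em:version>0.0.1</em:version>
    <em:targetApplication>
      <Description>
        <em:id>{ec8030f7-c20a-464f-9b0e-13a3a9e97384}</em:id>
        <em:minVersion>1.5</em:minVersion>
        <em:maxVersion>3.*</em:maxVersion>
      </Description>
    </em:targetApplication>
    <em:creator>marvel</em:creator>
    <!-- original tag name lost; em:contributor is a guess -->
    <em:contributor>happygirl</em:contributor>
    <em:description>Surf on lilacbbs.com</em:description>
    <em:homepageURL>http://lilacbbs/firefox/lilac/</em:homepageURL>
    <em:iconURL>chrome://surflilac/skin/surflilac.png</em:iconURL>
  </Description>
</RDF>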

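Here the first em:id is the id of the extension itself. It can take two forms. One looks like an email address, for example foo@a.com, although it does not have to be your real email address. The other is a GUID; see http://developer.mozilla.org/en/docs/Generating_GUIDs for how to generate one.

The second em:id is the id of the host application the extension is installed into, which can be looked up at https://addons.mozilla.org/en-US/firefox/pages/appversions.
At first I thought this em:id could be anything, and firefox then reported that the extension was incompatible, haha. Remember: for an extension that targets firefox (versions 0.3, 0.6, 0.7, 0.7+, 0.8, 0.8+, 0.9.x, 0.9, 0.9.0+, 0.9.1+, 0.9.2+, 0.9.3, 0.9.3+, 0.9+, 0.10, 0.10.1, 0.10+, 1.0, 1.0.1, 1.0.2, 1.0.3, 1.0.4, 1.0.5, 1.0.6, 1.0.7, 1.0.8, 1.0+, 1.4, 1.4.0, 1.4.1, 1.5b1, 1.5b2, 1.5, 1.5.0.4, 1.5.0.*, 2.0a1, 2.0a2, 2.0a3, 2.0b1, 2.0b2, 2.0, 2.0.0.4, 2.0.0.8, 2.0.0.*, 3.0a1, 3.0a2, 3.0a3, 3.0a4, 3.0a5, 3.0a6, 3.0a7, 3.0a8pre, 3.0a8, 3.0a9, 3.0b1, 3.0b2pre, 3.0b2, 3.0b3pre, 3.0b3, 3.0b4pre, 3.0b4, 3.0b5pre, 3.0b5, 3.0pre, 3.0, 3.0.*, 3.1a1pre), this value is always {ec8030f7-c20a-464f-9b0e-13a3a9e97384}.
One more thing: firefox presumably tells extensions apart by their em:id, so do not take the lazy route and reuse the em:id of someone else's extension 🙂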
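
The two id forms are easy to tell apart mechanically. The sketch below uses simplified patterns that I wrote for this example; they are not the exact rules firefox applies. It also shows one way to produce a fresh GUID-style id with Python's standard `uuid` module.

```python
import re
import uuid

# Simplified patterns written for this sketch, not taken from firefox itself.
GUID_RE = re.compile(r"^\{[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\}$")
EMAIL_LIKE_RE = re.compile(r"^[A-Za-z0-9._-]*@[A-Za-z0-9._-]+$")

# The firefox application id quoted in this post.
FIREFOX_APP_ID = "{ec8030f7-c20a-464f-9b0e-13a3a9e97384}"


def id_kind(ext_id):
    """Classify an id as 'guid', 'email-like', or None if it is neither."""
    if GUID_RE.match(ext_id):
        return "guid"
    if EMAIL_LIKE_RE.match(ext_id):
        return "email-like"
    return None


def new_guid_id():
    """Generate a random GUID-style id wrapped in curly braces."""
    return "{%s}" % uuid.uuid4()


print(id_kind("marvelliu.@gmail.com"))
print(id_kind(FIREFOX_APP_ID))
print(id_kind(new_guid_id()))
```

Generating a new id this way, instead of copying one from another extension, keeps two extensions from being mistaken for each other.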

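Firefox3 is pretty good

It is noticeably more responsive under linux and hardly ever stalls. Software keeps getting better :-)

I just downloaded a firefox extension called scribefire for writing blog posts, and it feels good. It can detect the blog type and supports the various WS features.

Yes, this post was written and published with scribefire. From now on I plan to use it as my offline blog editor.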