from binascii import hexlify
from ctypes import create_string_buffer, addressof
from socket import socket, AF_PACKET, SOCK_RAW, SOL_SOCKET
from struct import pack, unpack

# A subset of Berkeley Packet Filter constants and macros, as defined in
# linux/filter.h.

# Instruction classes
BPF_LD = 0x00
BPF_JMP = 0x05
BPF_RET = 0x06

# ld/ldx fields
BPF_H = 0x08
BPF_B = 0x10
BPF_ABS = 0x20

# alu/jmp fields
BPF_JEQ = 0x10
BPF_K = 0x00

def bpf_jump(code, k, jt, jf):
    return pack('HBBI', code, jt, jf, k)

def bpf_stmt(code, k):
    return bpf_jump(code, k, 0, 0)

# Ordering of the filters is backwards of what would be intuitive for
# performance reasons: the check that is most likely to fail is first.
filters_list = [
    # Must have dst port 67. Load (BPF_LD) a half word value (BPF_H) in
    # ethernet frame at absolute byte offset 36 (BPF_ABS). If value is equal to
    # 67 then do not jump, else jump 5 statements.
    bpf_stmt(BPF_LD | BPF_H | BPF_ABS, 36),
    bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 67, 0, 5),

    # Must be UDP (check protocol field at byte offset 23)
    bpf_stmt(BPF_LD | BPF_B | BPF_ABS, 23),
    bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 0x11, 0, 3),

    # Must be IPv4 (check ethertype field at byte offset 12)
    bpf_stmt(BPF_LD | BPF_H | BPF_ABS, 12),
    bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 0x0800, 0, 1),

    bpf_stmt(BPF_RET | BPF_K, 0x0fffffff),  # pass
    bpf_stmt(BPF_RET | BPF_K, 0),  # reject
]

# Create filters struct and fprog struct to be used by SO_ATTACH_FILTER, as
# defined in linux/filter.h.
filters = ''.join(filters_list)
b = create_string_buffer(filters)
mem_addr_of_filters = addressof(b)
fprog = pack('HL', len(filters_list), mem_addr_of_filters)

# As defined in asm/socket.h
SO_ATTACH_FILTER = 26

# Create listening socket with filters
s = socket(AF_PACKET, SOCK_RAW, 0x0800)
s.setsockopt(SOL_SOCKET, SO_ATTACH_FILTER, fprog)
s.bind(('eth0', 0x0800))

while True:
    data, addr = s.recvfrom(65565)
    print 'got data from', addr, ':', hexlify(data)
Friday, December 30, 2011
Raw sockets with BPF in Python
Update 2021: Note that this was written in 2011. Nowadays I'd not recommend doing it this way, but instead using BCC to write your filters in C, from within Python.
The following example shows the use of raw sockets where filtering is applied using BPF. It has only been tested on Linux.
The filter data structure is built up to form a machine language-like set of instructions that decide whether the packet should be passed to the raw socket or not. The filter is applied to the socket using SO_ATTACH_FILTER.
This specific example filters packets so that only those destined for a DHCP server (UDP port 67) are passed to the socket.
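As a rough sketch of how the filter data structure is put together, the following Python 3 version builds the same DHCP filter and packs it into the two structures defined in linux/filter.h (struct sock_filter and struct sock_fprog). The helper name build_fprog and the use of the 'P' (pointer) struct format instead of 'L' are assumptions made for illustration, not part of the post's code. Nothing here opens a socket, so it runs without root.

```python
import ctypes
import struct

# Subset of the constants from linux/filter.h used by the post.
BPF_LD, BPF_JMP, BPF_RET = 0x00, 0x05, 0x06
BPF_H, BPF_B, BPF_ABS = 0x08, 0x10, 0x20
BPF_JEQ, BPF_K = 0x10, 0x00


def bpf_jump(code, k, jt, jf):
    # struct sock_filter: u16 code, u8 jt, u8 jf, u32 k (8 bytes)
    return struct.pack('HBBI', code, jt, jf, k)


def bpf_stmt(code, k):
    return bpf_jump(code, k, 0, 0)


def build_fprog(instructions):
    """Hypothetical helper: pack a list of instructions into a sock_fprog.

    Returns (fprog, buf). fprog only stores the address of buf, so buf
    has to stay referenced at least until the setsockopt call.
    """
    blob = b''.join(instructions)
    buf = ctypes.create_string_buffer(blob)
    # struct sock_fprog: unsigned short len, then a pointer to the filter.
    # 'P' is an assumption here; the post uses 'L'.
    fprog = struct.pack('HP', len(instructions), ctypes.addressof(buf))
    return fprog, buf


dhcp_filter = [
    bpf_stmt(BPF_LD | BPF_H | BPF_ABS, 36),             # UDP dst port
    bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 67, 0, 5),
    bpf_stmt(BPF_LD | BPF_B | BPF_ABS, 23),             # IP protocol
    bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 0x11, 0, 3),
    bpf_stmt(BPF_LD | BPF_H | BPF_ABS, 12),             # ethertype
    bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 0x0800, 0, 1),
    bpf_stmt(BPF_RET | BPF_K, 0x0fffffff),              # pass
    bpf_stmt(BPF_RET | BPF_K, 0),                       # reject
]

fprog, buf = build_fprog(dhcp_filter)
```

The resulting fprog would then be handed to setsockopt(SOL_SOCKET, SO_ATTACH_FILTER, fprog) in the same way as in the post's code.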
Nice job. But I wonder where you got the info about programming BPF.
Maybe you could help. I just want to capture all frames with custom ethertype 0x7788 OR 0x7799.
What would the filter look like? How do you handle the OR condition?
Many thanks in advance
Alain
Thanks! Regarding where I got the info: if I remember right, I found some inspiration in the source code of Scapy, and also by searching for SO_ATTACH_FILTER in general and reading examples.
To achieve an OR, I think you could do something like
- Load the ethertype into the register
- Do a jump to a "return pass" if it's 0x7788
- Do a jump to the "return pass" if 0x7799
- "return reject" in case none of the jumps succeeded
- "return pass" (the jumps would jump to here if they succeeded)
Does it make sense?
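The steps above can be sketched and checked without a socket. The snippet below writes the OR filter as (code, jt, jf, k) tuples in exactly that order and runs it through a tiny interpreter. The interpreter (run_filter) and the frame helper are hypothetical, written only for this illustration; they handle just the three instruction forms used here and are not the kernel's implementation.

```python
import struct

BPF_LD, BPF_JMP, BPF_RET = 0x00, 0x05, 0x06
BPF_H, BPF_ABS = 0x08, 0x20
BPF_JEQ, BPF_K = 0x10, 0x00

PASS = 0x0fffffff

# Each entry is (code, jt, jf, k). Jump offsets are relative to the
# instruction that follows the jump.
or_filter = [
    (BPF_LD | BPF_H | BPF_ABS, 0, 0, 12),       # load ethertype
    (BPF_JMP | BPF_JEQ | BPF_K, 2, 0, 0x7788),  # equal: skip 2 -> pass
    (BPF_JMP | BPF_JEQ | BPF_K, 1, 0, 0x7799),  # equal: skip 1 -> pass
    (BPF_RET | BPF_K, 0, 0, 0),                 # reject
    (BPF_RET | BPF_K, 0, 0, PASS),              # pass
]


def run_filter(program, frame):
    """Toy interpreter for the three instruction forms used above."""
    acc = 0
    pc = 0
    while pc < len(program):
        code, jt, jf, k = program[pc]
        pc += 1
        if code == BPF_LD | BPF_H | BPF_ABS:
            if k + 2 > len(frame):
                return 0  # this sketch rejects on out-of-range loads
            acc = struct.unpack_from('!H', frame, k)[0]
        elif code == BPF_JMP | BPF_JEQ | BPF_K:
            pc += jt if acc == k else jf
        elif code == BPF_RET | BPF_K:
            return k
        else:
            raise ValueError('instruction not supported by this sketch')
    return 0


def frame_with_ethertype(ethertype):
    # 12 zero bytes stand in for the two MAC addresses.
    return bytes(12) + struct.pack('!H', ethertype)


accepted = [run_filter(or_filter, frame_with_ethertype(t)) != 0
            for t in (0x7788, 0x7799, 0x0800)]
```

With this ordering, both ethertypes jump forward to the final "return pass", and anything else falls through to the "return reject" right after the two comparisons.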
Wow, that's a fast answer.
Here is my attempt:
# load proto(ethertype field at byte offset 12)
bpf_stmt(BPF_LD | BPF_H | BPF_ABS, 12),
# CHECK IF ethertype== 0x7788
bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 0x7788, 0, 2),
# CHECK IF ethertype== 0x7799
bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 0x7799, 0, 1),
bpf_stmt(BPF_RET | BPF_K, 0x0fffffff), # pass
bpf_stmt(BPF_RET | BPF_K, 0), # reject
Is this correct?
Alain
I guess this is a better version.
# load proto(ethertype field at byte offset 12)
bpf_stmt(BPF_LD | BPF_H | BPF_ABS, 12),
# CHECK IF ethertype== 0x7788, if equal skip 2 statements
bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 0x7788, 2, 0),
# CHECK IF ethertype== 0x7799, if not equal skip 1 statement
bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 0x7799, 1, 0),
bpf_stmt(BPF_RET | BPF_K, 0x0fffffff), # pass
bpf_stmt(BPF_RET | BPF_K, 0), # reject
Many thanks
Alain
I got it working. For posterity, here is my code:
# load proto(ethertype field at byte offset 12)
bpf_stmt(BPF_LD | BPF_H | BPF_ABS, 12),
# CHECK IF ethertype== 0x7788, if equal skip 2 statements
bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 0x7788, 1, 0),
# CHECK IF ethertype== 0x7799, if not equal skip 1 statement
bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 0xa799, 0, 1),
bpf_stmt(BPF_RET | BPF_K, 0x0fffffff), # pass
bpf_stmt(BPF_RET | BPF_K, 0), # reject
Many thanks again. This saved me a lot of time!
Alain
Sorry, pasted wrong code.
It should be OK now.
# load proto(ethertype field at byte offset 12)
bpf_stmt(BPF_LD | BPF_H | BPF_ABS, 12),
# CHECK IF ethertype== 0x7788, if equal skip 1 statement
bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 0x7788, 1, 0),
# CHECK IF ethertype== 0x7799, if not equal skip 1 statement
bpf_jump(BPF_JMP | BPF_JEQ | BPF_K, 0x7799, 0, 1),
bpf_stmt(BPF_RET | BPF_K, 0x0fffffff), # pass
bpf_stmt(BPF_RET | BPF_K, 0), # reject
Alain
Glad you got it working :-)
fwiw, you can use tcpdump -d and tcpdump -dd to create expressions to use; I find it much easier than using the language directly.
For example, let's say you want to filter all icmp pings. I would use something like this:
export expression="icmp[icmptype] == icmp-echo";
paste -d"\n" \
<(tcpdump -d $expression | sed -e 's/^/# /') \
<(tcpdump -dd $expression |tr -s '{}' '()')
to generate the filter, which would look like:
# (000) ldh [12]
( 0x28, 0, 0, 0x0000000c ),
# (001) jeq #0x800 jt 2 jf 10
( 0x15, 0, 8, 0x00000800 ),
# (002) ldb [23]
( 0x30, 0, 0, 0x00000017 ),
# (003) jeq #0x1 jt 4 jf 10
( 0x15, 0, 6, 0x00000001 ),
# (004) ldh [20]
( 0x28, 0, 0, 0x00000014 ),
# (005) jset #0x1fff jt 10 jf 6
( 0x45, 4, 0, 0x00001fff ),
# (006) ldxb 4*([14]&0xf)
( 0xb1, 0, 0, 0x0000000e ),
# (007) ldb [x + 14]
( 0x50, 0, 0, 0x0000000e ),
# (008) jeq #0x8 jt 9 jf 10
( 0x15, 0, 1, 0x00000008 ),
# (009) ret #65535
( 0x6, 0, 0, 0x0000ffff ),
# (010) ret #0
( 0x6, 0, 0, 0x00000000 ),
put this in a list, like expression = [ ... ], and you can use something like:
blob = ctypes.create_string_buffer(
''.join(struct.pack("HBBI", *e) for e in expression))
address = ctypes.addressof(blob)
to_pass = struct.pack('HL', len(expression), address)
I was about to write about this on my blog, http://blog.yosti.net/, but it seemed more useful here! Also, thanks for the interesting article.
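A possible Python 3 variant of this approach reads the raw `tcpdump -dd` output (the lines in curly braces) directly, instead of rewriting the braces by hand. This is only a sketch: the function name parse_tcpdump_dd and the regular expression are assumptions made for illustration, and the sample text is just the first two instructions from the listing above.

```python
import re
import struct

# One "{ code, jt, jf, k }," line as printed by tcpdump -dd.
DD_LINE = re.compile(
    r'^\s*\{\s*(0x[0-9a-fA-F]+|\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,'
    r'\s*(0x[0-9a-fA-F]+|\d+)\s*\}\s*,?\s*$'
)


def parse_tcpdump_dd(text):
    """Hypothetical helper: turn tcpdump -dd text into (code, jt, jf, k)."""
    program = []
    for line in text.splitlines():
        match = DD_LINE.match(line)
        if match is None:
            continue  # skip blank lines and "# (000) ..." comment lines
        code, jt, jf, k = match.groups()
        program.append((int(code, 0), int(jt), int(jf), int(k, 0)))
    return program


sample = """\
# (000) ldh [12]
{ 0x28, 0, 0, 0x0000000c },
# (001) jeq #0x800 jt 2 jf 10
{ 0x15, 0, 8, 0x00000800 },
"""

expression = parse_tcpdump_dd(sample)
# In Python 3 the packed instructions are bytes, so join with b''.
blob = b''.join(struct.pack('HBBI', *e) for e in expression)
```

Matching only whole brace lines keeps the interleaved "# (000) ..." comments from being mistaken for instructions.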
Hi Mark. Cool stuff. I had read that people were generating the filter code blocks using tcpdump, but never got around to figuring out how in practice. Thanks for sharing!
This is fantastic, thank you!
However, I'm having a slight issue. Python can't find AF_PACKET. Is there an alternative I should be using?
Thanks!
-brian
Maybe the operating system you are using does not support this type of raw socket. I was using Linux for this example. I see that on my Windows machines, I get the error "ImportError: cannot import name AF_PACKET", so I'm afraid this just doesn't work on Windows. I think on Windows you will have to go for something like libpcap (search the net for examples of this, I haven't tried using it). I hope you find a solution!
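One way to make this failure easier to diagnose is to check for AF_PACKET before using it, rather than importing it by name. The sketch below is Python 3; the function names are hypothetical, and open_packet_socket is only defined, not called, since creating a packet socket normally needs root on Linux.

```python
import socket


def af_packet_available():
    # The constant is only defined by the socket module on platforms
    # that support packet sockets (for example Linux).
    return hasattr(socket, 'AF_PACKET')


def open_packet_socket(ethertype=0x0800):
    """Hypothetical helper: open a raw packet socket or fail clearly."""
    if not af_packet_available():
        raise OSError('AF_PACKET is not available on this platform')
    # The protocol argument is given in network byte order.
    return socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                         socket.htons(ethertype))


supported = af_packet_available()
```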
This was on OpenBSD, which has bpf, so I'm surprised about the AF_PACKET thing. I'll do some research and see what I need to do. libpcap is probably more portable though, so I might just go with that.
Thanks!!
-brian
Hello Allan, thank you for the post. How can I bind the socket to multiple interfaces? I tried "s.bind(('0.0.0.0', 0x0800))", but it is not working. Any help?
Hello, just want to share some info: if anyone is interested in sniffing traffic on multiple ports, refer to my question at http://stackoverflow.com/questions/41340734/bpf-in-python-to-sniff-packets-for-multiple-tcp-ports
Hello, here is a tip. If you want to capture all the packet flow in your network, don't bind the socket to an interface, or you will only get half of the packets. This is a raw socket, and a raw socket doesn't need to be bound to anything: it receives data from the raw socket, not per interface.
If I am wrong, please let me know. The tip above is based on my own development: when I bound the interface I only got half of the packets, which didn't fit my needs.
That said, this article is very inspiring. Thank you for your great job, Allan. :)