
Intel igb/em Interfaces Broken on 2.6/22.01+

Open ChronicledMonocle opened this issue 3 years ago • 166 comments

The DHCP lease for connections is not handed through to the ngeth0 interface properly. There aren't any real "errors" in the logs.

If you try to run the script manually after boot you get "ngctl: send msg: File exists"
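One possible reading of "File exists": netgraph will not create a node or name that is already present, so a manual rerun collides with whatever the boot-time run left behind. The sketch below only checks for that by parsing `ngctl list` output. The function name and the node name `o2m` are assumptions (the latter guessed from the "creating ng_one2many" log line), not something confirmed in this thread.

```shell
#!/bin/sh
# Succeeds if a netgraph node with the given name appears in
# `ngctl list` output supplied on stdin.
# Assumes lines of the form:
#   Name: igb0   Type: ether   ID: 00000001   Num hooks: 2
ng_node_listed() {
    awk -v want="$1" '
        $1 == "Name:" && $2 == want { found = 1 }
        END { exit found ? 0 : 1 }
    '
}
```

Usage would be something like `ngctl list | ng_node_listed o2m && echo "o2m already exists"`, run as root on the firewall.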

Logs from pfatt.log:

2022-02-14 14:36:56 :: [pfatt.sh] :: pfSense + AT&T U-verse Residential Gateway for true bridge mode
2022-02-14 14:36:56 :: [pfatt.sh] :: Configuration:
2022-02-14 14:36:56 :: [pfatt.sh] :: ONT_IF: igb0
2022-02-14 14:36:56 :: [pfatt.sh] :: RG_IF: igb1
2022-02-14 14:36:56 :: [pfatt.sh] :: RG_ETHER_ADDR: [MY MAC HERE]
2022-02-14 14:36:56 :: [pfatt.sh] :: attaching interfaces to ng_ether... OK!
2022-02-14 14:36:56 :: [pfatt.sh] :: building netgraph nodes...
2022-02-14 14:36:56 :: [pfatt.sh] :: creating ng_one2many...
2022-02-14 14:37:00 :: [pfatt.sh] :: pfSense + AT&T U-verse Residential Gateway for true bridge mode

I am not running wpa_supplicant mode.

ChronicledMonocle avatar Feb 14 '22 20:02 ChronicledMonocle

Can confirm it's not working after an upgrade. Following the troubleshooting instructions shows that the modules have loaded. The pfatt logs don't show anything.

Grassyloki avatar Feb 14 '22 22:02 Grassyloki

Can confirm it is broken for me as well, running supplicant

bigjohns97 avatar Feb 14 '22 23:02 bigjohns97

Same issue. Not grabbing DHCP.

Edit: I am using the WPA supplicant method.

neydah700 avatar Feb 15 '22 02:02 neydah700

Want to confirm that reverting to 2.5.2 or 21.05.2 immediately restores internet for me after setting everything back up.

ChronicledMonocle avatar Feb 15 '22 06:02 ChronicledMonocle

Want to confirm that reverting to 2.5.2 or 21.05.2 immediately restores internet for me after setting everything back up.

Yes, It was an absolute pain in the a**, but restoring to 21.05.2 immediately fixed it for me. IPv6 wouldn't grab for an hour or so but finally started working.

neydah700 avatar Feb 15 '22 06:02 neydah700

Also, I posted on the Netgate Forums. If anyone else wants to add anything over there here is the link. https://forum.netgate.com/topic/169882/22-01-2-6-0-upgrade-broke-dhcp-on-wan-interface-with-custom-startup-script

neydah700 avatar Feb 15 '22 06:02 neydah700

I am having the same problem, and now my WireGuard and other tools don't work and I can't get them to work.

SGC1990 avatar Feb 16 '22 13:02 SGC1990

Yep - supplicant not working for me either. The last time a new version of pfsense broke pfatt Matt Johnson submitted this issue to pfsense redmine. Should we do that here? Here is the issue that originated the whole thing.

grevelle avatar Feb 17 '22 02:02 grevelle

It also broke mine after the update. Per the docs, I ran "tcpdump -ei ONT_IF" and "tcpdump -ei RG_IF", which should filter and capture link-layer information (2), on my interfaces. I captured 0 packets from RG_IF and only the bridged DHCP traffic on the ONT_IF interface.

I reset netgraph, which removes the hooks, then rebooted the gateway and modem with tcpdump running and captured 0 packets from the interfaces. Before removing the netgraph hooks, the only traffic I saw on any of the three interfaces was the DHCP request on the ngeth0 virtual interface and on the bridged ONT_IF interface. So the DHCP requests are still getting to the correct interface.

The fact that tcpdump doesn't see any traffic makes me think it's being filtered, like promisc mode isn't allowing EAPOL 802.1X traffic to be captured, and therefore it is not bridged. No authentication means no DHCP response. IMO
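To test that theory more directly, the capture can be narrowed to EAPOL frames only (EtherType 0x888e). This is a minimal sketch, not part of pfatt; the function name and the `TCPDUMP` override variable are made up here for illustration.

```shell
#!/bin/sh
# Capture only EAPOL (802.1X) frames, EtherType 0x888e, on one interface.
# $1: interface name (e.g. the ONT_IF or RG_IF value from pfatt.sh)
# $2: number of frames to capture before exiting (default 10)
# TCPDUMP is a hypothetical override hook, useful for dry runs.
capture_eapol() {
    iface=$1
    count=${2:-10}
    ${TCPDUMP:-tcpdump} -e -n -c "$count" -i "$iface" ether proto 0x888e
}
```

For example, `capture_eapol igb1 5` on the RG side while the gateway reboots. If nothing shows up there, the 802.1X frames are not reaching the interface at all.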

I've moved to inline behind the gateway until this can be figured out. I would be willing to test once a day.

bigg1969 avatar Feb 17 '22 03:02 bigg1969

Okay, had some success today based on info I gathered from all the various discussions online. I think it is something to do with the em(4) driver. Do all of you having issues have Intel NICs? I put together a test pfSense server from a bunch of spare parts and it worked right away on the latest release. After digging, I couldn't get any Intel NIC to work. Using what I had around (a few crappy USB dongles and old PCs with integrated NICs worked) I had success with everything not Intel GbE. When I re-upgraded my main pfSense box I was able to move my WAN link to an SFP slot (with RJ45 module) with some success. I say "some" because all my SFP/RJ45 modules are 10Gb and they do not negotiate well with the ONT.

Something interesting for me, if_em.ko is present in /boot/kernel on 2.6.0 but wasn't in my previous version of pfSense. My knowledge is limited but I am not sure where the driver was located in the previous version? Anyone smarter than me know?

Some Useful Links:
FreeBSD 12.3 Release Notes (em(4) driver notes) - https://www.freebsd.org/releases/12.3R/relnotes/
Reddit Discussion - https://www.reddit.com/r/PFSENSE/comments/ssgsha/psa_260_breaks_att_bypass/?sort=new
Netgate Forum Discussion - https://forum.netgate.com/topic/99190/att-uverse-rg-bypass-0-2-btc/396?_=1644931323812
OPNSense GIT Issue - https://github.com/MonkWho/pfatt/issues/65

neydah700 avatar Feb 17 '22 08:02 neydah700

I think this is going somewhere, because I've tried multiple different boxes but they all have Intel NICs. When I get off work I will try a couple of USB dongles to see if it gets traffic that way.

SGC1990 avatar Feb 17 '22 09:02 SGC1990

I think this is going somewhere, because I've tried multiple different boxes but they all have Intel NICs. When I get off work I will try a couple of USB dongles to see if it gets traffic that way.

The USBs work for me but are slow. Download is like 100m, upload is better at around 400m. I have a 1G SFP that should get here tomorrow. Really hoping that talks better with the ONT than the 10G did.

neydah700 avatar Feb 17 '22 09:02 neydah700

For USBs to work at 1 gig speeds you have to have a USB 3.1 port or better. For FreeBSD, I am using a box equivalent to the Netgate 1541. Same everything but a lot more powerful. Let me know how it goes with the other NICs.

SGC1990 avatar Feb 17 '22 09:02 SGC1990

For USBs to work at 1 gig speeds you have to have a USB 3.1 port or better. I am using a box equivalent to the Netgate 1541. Same everything but a lot more powerful. Let me know how it goes with the other NICs.

Will do! If it helps I'm using the XG-1537 so USB3.0

neydah700 avatar Feb 17 '22 09:02 neydah700

Are the USB dongles 3.0? When I was using USB in the past it worked great; I was able to get full 1Gb speeds out of my USB ports. If the USB is 3.0 then I don't know why I am getting full 1Gb speeds. But I did downgrade back to 2.5.2, and now WireGuard doesn't work on 2.5.2.

SGC1990 avatar Feb 17 '22 10:02 SGC1990

Is the usb dongles 3.0

Yep!

neydah700 avatar Feb 17 '22 10:02 neydah700

Okay, had some success today based on info I gathered from all the various discussions online. I think it is something to do with the em(4) driver.

Nothing useful to add here, but I can confirm I'm using an Intel NIC with the em driver. Neither tethered nor supplicant is working for me on 22.1, but supplicant is working on 21.7.8.

em0: <Intel(R) 82583V> port 0xe000-0xe01f mem 0xdf500000-0xdf51ffff,0xdf520000-0xdf523fff irq 16 at device 0.0 on pci1
em1: <Intel(R) 82583V> port 0xd000-0xd01f mem 0xdf400000-0xdf41ffff,0xdf420000-0xdf423fff irq 17 at device 0.0 on pci2
em2: <Intel(R) 82583V> port 0xc000-0xc01f mem 0xdf300000-0xdf31ffff,0xdf320000-0xdf323fff irq 18 at device 0.0 on pci3
em3: <Intel(R) 82583V> port 0xb000-0xb01f mem 0xdf200000-0xdf21ffff,0xdf220000-0xdf223fff irq 19 at device 0.0 on pci4
em4: <Intel(R) 82583V> port 0xa000-0xa01f mem 0xdf100000-0xdf11ffff,0xdf120000-0xdf123fff irq 16 at device 0.0 on pci5
em5: <Intel(R) 82583V> port 0x9000-0x901f mem 0xdf000000-0xdf01ffff,0xdf020000-0xdf023fff irq 17 at device 0.0 on pci6

I'm on a Protectli FW6D

MrCaturdayNight avatar Feb 17 '22 12:02 MrCaturdayNight

I am using an Intel NIC but with the IGB driver.

bigjohns97 avatar Feb 17 '22 16:02 bigjohns97

I am using an Intel NIC but with the IGB driver.

And is it working or not? My system is using igb drivers too, and mine is not working.

SGC1990 avatar Feb 17 '22 17:02 SGC1990

My knowledge on FreeBSD is limited but I believe igb uses the em(4) driver. All the common Intel cards fall under it (I350, 82575, etc.)

https://www.freebsd.org/releases/12.3R/hardware/
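For anyone unsure which driver their WAN port is on: FreeBSD interface names are the driver name plus a unit number, so the driver can be read off the name. A small sketch (both function names are made up for illustration, and the "broken" list is just what has been reported in this thread so far):

```shell
#!/bin/sh
# FreeBSD interface names are the driver name followed by a unit number,
# so stripping the trailing digits gives the driver (igb0 -> igb).
nic_driver() {
    printf '%s\n' "$1" | sed 's/[0-9]*$//'
}

# Succeeds for the two driver names reported as failing in this thread.
reported_broken_driver() {
    case "$(nic_driver "$1")" in
        em|igb) return 0 ;;
        *)      return 1 ;;
    esac
}
```

Something like `reported_broken_driver igb0 && echo "likely affected"` would flag the interfaces people here are having trouble with.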

neydah700 avatar Feb 17 '22 17:02 neydah700

I am using an Intel NIC but with the IGB driver.

And is it working or not? My system is using igb drivers too, and mine is not working.

Not working

bigjohns97 avatar Feb 17 '22 17:02 bigjohns97

If you look at the if_igb.ko driver in /boot/kernel it just is a shortcut to if_em.ko. I think at one point the two intel drivers merged. https://www.intel.com/content/www/us/en/download/15187/intel-network-adapter-gigabit-base-driver-for-freebsd.html?wapkw=i350%20freebsd
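A quick way to check the "shortcut" claim on your own box, sketched under the assumption that the two files are linked (symlink or hard link) rather than copies. The function name is made up for illustration.

```shell
#!/bin/sh
# Succeeds if both paths resolve to the same file (symlink or hard link).
# `-ef` is not in POSIX but is supported by FreeBSD sh and bash.
same_file() {
    [ "$1" -ef "$2" ]
}
```

On the firewall this would be `same_file /boot/kernel/if_igb.ko /boot/kernel/if_em.ko && echo "igb is just em"`.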

neydah700 avatar Feb 17 '22 17:02 neydah700

Okay, I got everything up and working on my regular Intel NIC. I’m not the biggest expert here so bear with me.

Through troubleshooting I was able to get every non-Intel NIC to authenticate and pull DHCP. After more testing all igb(4) driver-based cards failed. In the /boot/kernel folder I noticed if_igb.ko is simply a shortcut to the em(4) driver (if_em.ko). I am guessing FreeBSD is using this combined driver from intel? https://www.intel.com/content/www/us/en/download/15187/intel-network-adapter-gigabit-base-driver-for-freebsd.html

Alternatively, I found this driver that appears to be for igb(4) separately, and it seems newer. https://www.intel.com/content/www/us/en/download/14610/intel-network-adapter-driver-for-82575-6-and-82580-based-gigabit-network-connections-under-freebsd.html?wapkw=i350%20freebsd

I downloaded a FreeBSD-12.3 VM and its related source code (amd64), and compiled the separate igb(4) driver.

I loaded my newly compiled if_igb.ko into the /boot/modules folder with chmod 555 permissions. Next, I added the following two lines to my /boot/loader.conf file to supersede the included driver.

if_igb_load="YES"
if_igb_name="/boot/modules/if_igb.ko"

Rebooted and everything came up just fine!
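The loader change above can be sketched as a small helper that adds the two lines only once, so rerunning it does not pile up duplicates. The function name is made up for illustration, and the config file path is passed in rather than hard-coded.

```shell
#!/bin/sh
# Append the igb override to a loader config file, but only once.
# $1: loader config file (e.g. /boot/loader.conf)
# $2: full path to the replacement module
add_igb_override() {
    conf=$1
    module=$2
    touch "$conf" || return 1
    # Skip if an if_igb_name line is already there, so reruns stay harmless.
    if grep -q '^if_igb_name=' "$conf"; then
        return 0
    fi
    printf 'if_igb_load="YES"\nif_igb_name="%s"\n' "$module" >> "$conf"
}
```

For example, `add_igb_override /boot/loader.conf /boot/modules/if_igb.ko` after copying the module into place and setting it to mode 555.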

Feel free to use my compiled if_igb.ko if you don’t want to build your own. https://github.com/neydah700/pfsense_intel/blob/main/if_igb.ko

Also, for reference here is my pfatt script if anyone needs a reference. https://github.com/neydah700/pfsense_intel/blob/main/pfatt_intel.sh

A few notes:

  1. When I clean installed 2.6.0 (and 22.01 on my pfSense+ box) absolutely nothing I did allowed my pfatt script to run successfully from the /cf/conf directory. I ended up moving it to /root/pfatt and everything worked. This seemed to only be an issue once I moved to a ZFS file system but who knows.
  2. I have an angry family since our internet has been up and down for a few days now.

neydah700 avatar Feb 17 '22 19:02 neydah700

Interesting that the intel igb driver works. I searched for bugs on the FreeBSD buglist and found this...

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260068

Looks like it might be related? Issues with vlan tagging. Was introduced in 13.0 and 12.3... recently fixed in the stable branches, so the timing lines up.

lnxsrt avatar Feb 17 '22 20:02 lnxsrt

Some comments and feedback in testing so far:

  1. It seems safe to install and test this on 2.5.2. I have downloaded the kernel module and am testing prior to any updates. I haven't managed to break 2.5.2... yet.

  2. It would be better to create /boot/loader.conf.local instead of /boot/loader.conf. Loader.conf may be overwritten by pfsense updates.

  3. What is your output on 2.6.0 with the if_igb.ko module for "kldstat -v"? I can't confirm it is loading and in use on 2.5.2. I am reluctant to upgrade until I can validate it is loading.
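For point 3, the interesting part of `kldstat -v` is which file the if_igb.ko entry was loaded from. A rough sketch that pulls that path out, assuming the usual "Id Refs Address Size Name (path)" layout of the top-level lines; the function name is made up for illustration.

```shell
#!/bin/sh
# Print the on-disk path of a loaded module, given `kldstat -v` output on stdin.
# Assumes top-level lines of the form:
#   Id Refs Address Size Name (path)
kld_path() {
    awk -v mod="$1" '
        $5 == mod { gsub(/[()]/, "", $6); print $6 }
    '
}
```

Run as `kldstat -v | kld_path if_igb.ko`. A path under /boot/modules would mean the replacement module is the one in use; no output would mean no separate if_igb.ko file is loaded.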

jasonsansone avatar Feb 17 '22 21:02 jasonsansone

Interesting that the intel igb driver works. I searched for bugs on the FreeBSD buglist and found this...

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260068

Looks like it might be related? Issues with vlan tagging. Was introduced in 13.0 and 12.3... recently fixed in the stable branches, so the timing lines up.

Could explain why we are passing 802.1x not pulling DHCP on VLAN 0. I'll add it to my redmine issue on pfSense. If anyone else has success can they go on and comment. Hopefully we get some traction! https://redmine.pfsense.org/issues/12821?next_issue_id=12820

neydah700 avatar Feb 17 '22 21:02 neydah700

I am testing now, reimaging since WireGuard is broken in my install right now.

SGC1990 avatar Feb 17 '22 21:02 SGC1990

I am testing now, reimaging since WireGuard is broken in my install right now.

Interesting that the intel igb driver works. I searched for bugs on the FreeBSD buglist and found this... https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260068 Looks like it might be related? Issues with vlan tagging. Was introduced in 13.0 and 12.3... recently fixed in the stable branches, so the timing lines up.

Could explain why we are passing 802.1x not pulling DHCP on VLAN 0. I'll add it to my redmine issue on pfSense. If anyone else has success can they go on and comment. Hopefully we get some traction! https://redmine.pfsense.org/issues/12821?next_issue_id=12820

Will do. Internet is going out for a bit while I update and bring the system online.

SGC1990 avatar Feb 17 '22 21:02 SGC1990

Some comments and feedback in testing so far:

  1. It seems safe to install and test this on 2.5.2. I have downloaded the kernel module and am testing prior to any updates. I haven't managed to break 2.5.2... yet.
  2. It would be better to create /boot/loader.conf.local instead of /boot/loader.conf. Loader.conf may be overwritten by pfsense updates.
  3. What is your output on 2.6.0 with the if_igb.ko module for "kldstat -v"? I can't confirm it is loading and in use on 2.5.2. I am reluctant to upgrade until I can validate it is loading.

Good point on the .local, will adjust that.

For my kldstat does just this portion work for ya or do you want the whole output?

3 1 0xffffffff83cfb000 35e08 if_igb.ko (/boot/modules/if_igb.ko)
    Contains modules:
        Id Name
        2 pci/igb

neydah700 avatar Feb 17 '22 21:02 neydah700

Good point on the .local, will adjust that.

For my kldstat does just this portion work for ya or do you want the whole output?

3 1 0xffffffff83cfb000 35e08 if_igb.ko (/boot/modules/if_igb.ko)
    Contains modules:
        Id Name
        2 pci/igb

Thank you. That is what I was curious about. It isn't loading on 2.5.2, but it may just be because it was compiled for a different kernel. I also can't get it to load with kldload on 2.5.2.
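If the mismatch theory is right, a cheap sanity check before dropping the module in is to compare the running release against the one the module was built for (FreeBSD 12.3, per the build notes earlier in the thread). A minimal sketch, assuming `uname -r` prints something like "12.3-STABLE"; the function name is made up for illustration.

```shell
#!/bin/sh
# Succeeds if a `uname -r` style string (e.g. "12.3-STABLE") belongs to the
# given major.minor release (e.g. "12.3").
release_matches() {
    case "$2" in
        "$1"|"$1"-*) return 0 ;;
        *)           return 1 ;;
    esac
}
```

For example, `release_matches 12.3 "$(uname -r)" || echo "module was built for a different release"`.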

jasonsansone avatar Feb 17 '22 21:02 jasonsansone