I have multiple Exchange 2013/Windows 2012 DAGs that sit behind 2 CAS servers in a Windows NLB. My production DAG is fine, but my pre-release DAG is now getting this error, so I imagine something on its 2 DAG member servers is causing it. I suspect one of my admins accidentally changed something in IIS, but I've been unable to find out who or what it was. Troubleshooting is hitting a wall too.
ECP and OWA to mailboxes on my other DAG work fine. I also have legacy redirection enabled to my Exchange 2007 environment, and when somebody with a 2007 mailbox logs into OWA via my 2013 CAS servers, that redirect works fine as well. EAS, RPC over HTTPS, and Autodiscover all appear fine against the problem DAG. Test E-mail AutoConfiguration works fine against the problem DAG as well, so the issue appears to be OWA related.
The symptoms:
- After logging into OWA on mailboxes in the affected DAG, a "something went wrong" screen appears.
An unexpected error occurred and your request couldn't be handled.
X-OWA-Error: System.NullReferenceException
X-OWA-Version: 15.0.775.32
X-FEServer: {My 2013 CAS}
X-BEServer: {Mailbox server hosting that DB}
Date: 7/9/2014 6:36:37 PM
- No information appears in the Windows event logs. Searching the HTTPERR logs also showed nothing. The W3SVC logs show the following entries. Do these 302 responses indicate an error?
CAS Server -
2014-07-09 18:11:20 10.128.13.38 POST /owa/auth.owa &cafeReqId=2d57ab13-496b-4ab3-b6c8-ecbd8061357c; 443 domain\username 172.30.108.17 Mozilla/5.0+(Windows+NT+6.1;+WOW64;+rv:30.0)+Gecko/20100101+Firefox/30.0 https://mail.domain.com/owa/auth/logon.aspx?replaceCurrent=1&url=https%3a%2f%2fmail.domain.com%2fowa%2f 302 0 0 0
Mailbox Server -
2014-07-09 18:11:22 10.128.13.76 GET /owa/default.aspx &ActID=10154033-ee4d-42bc-be5b-8efc0055c54f&ex=E404 444 domain\user 10.128.13.68 Mozilla/5.0+(Windows+NT+6.1;+WOW64;+rv:30.0)+Gecko/20100101+Firefox/30.0 https://mail.domain.com/owa/auth/logon.aspx?replaceCurrent=1&url=https%3a%2f%2fmail.domain.com%2fowa%2f 302 0 0 79
- While the ECP itself loads fine, it appears to have problems getting information from the DAG. For instance, on the databases tab under servers, the "Active on Server" value doesn't change even when failing the database back and forth.
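For anyone who wants to pick apart the two W3SVC entries above, here is a minimal Python sketch that splits an entry into named fields and classifies the status code. It assumes the logs use the default IIS W3C field selection, which matches the 15 space-separated tokens in my entries; the `#Fields:` header of the real log file is the authority on that. The names `FIELDS`, `parse_w3svc_line`, `describe_status`, and `query_params` are made up for illustration. Note that 302 is the HTTP "Found" redirect status, so by itself it is a redirect rather than an error.

```python
from urllib.parse import parse_qs

# Assumption: default IIS W3C field order. Verify against the "#Fields:"
# header line at the top of the actual log file.
FIELDS = [
    "date", "time", "s-ip", "cs-method", "cs-uri-stem", "cs-uri-query",
    "s-port", "cs-username", "c-ip", "cs(User-Agent)", "cs(Referer)",
    "sc-status", "sc-substatus", "sc-win32-status", "time-taken",
]


def parse_w3svc_line(line):
    """Split one W3C log entry into a dict keyed by field name."""
    tokens = line.split()
    if len(tokens) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(tokens)}")
    return dict(zip(FIELDS, tokens))


def describe_status(status):
    """Return the HTTP status class for a status code given as a string."""
    code = int(status)
    if code < 200:
        return "informational"
    if code < 300:
        return "success"
    if code < 400:
        return "redirect"
    if code < 500:
        return "client error"
    return "server error"


def query_params(entry):
    """Parse the cs-uri-query field into a dict of name -> list of values."""
    return parse_qs(entry["cs-uri-query"])


# The Mailbox server entry from above, as a raw string so the backslash in
# the username is kept literally.
MAILBOX_ENTRY = (
    r"2014-07-09 18:11:22 10.128.13.76 GET /owa/default.aspx "
    r"&ActID=10154033-ee4d-42bc-be5b-8efc0055c54f&ex=E404 444 domain\user "
    r"10.128.13.68 Mozilla/5.0+(Windows+NT+6.1;+WOW64;+rv:30.0)+Gecko/20100101+Firefox/30.0 "
    r"https://mail.domain.com/owa/auth/logon.aspx?replaceCurrent=1&url=https%3a%2f%2fmail.domain.com%2fowa%2f "
    r"302 0 0 79"
)

if __name__ == "__main__":
    entry = parse_w3svc_line(MAILBOX_ENTRY)
    print(entry["cs-uri-stem"], entry["s-port"], entry["sc-status"],
          describe_status(entry["sc-status"]))
    print(query_params(entry))
```

Parsing the query string separately makes the `ex=E404` value on the Mailbox server request easy to spot, which may be worth searching for alongside the `NullReferenceException`.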
All cluster resources are up. All agents appear fine. The OWA app pool is started and appears the same as the OWA app pool on my working DAG.
I've already tried the killbit fix. There was no killbit on my CAS servers, but one existed on each of the DAG mailbox servers. I deleted it from one DAG member, failed everything over to that member, and tried OWA. It still didn't work.
Thanks in advance. Let me know if you need any more info.