12 Office Hours Questions I Don’t Have Answers For

Last Updated 7 months ago

Hey, you! You look smart. Here are highly-upvoted questions that fellow readers submitted at https://pollgab.com/room/brento that I don’t have answers for. If you’d like to jump into the comments, you can reference questions by number with your answers. Time to show off your brains!

Kansas4444 asks: Hi Brent, do you have any advice on calculating checksum for an entire row on SQL Server ? (Brent says: I’ve never had to do this, so no idea.)
Margaret asks: Hi Brent — we are in the process of doing a lot of VLF reductions across the farm and I wanted to ask if issuing a CHECKPOINT command before trying to shrink the log was anything that might cause problems going forward?
Stanley Burrell: What is your opinion of Microsoft PerfInsights for load testing prospective new SQL Server VMs? (Brent says: never even heard of it.)
BamDBA asks: When using Transparent Data Encryption for databases in Availability Groups in a busy environment, are there any tips/best practices to increase performance?
I’m_Trying asks: Are you aware of any way to implement multifactor authentication for versions previous to SQL 2022 for on prem databases?
LCK_Tess_Monster: RE: MAXDOP=0 & Mastering Indexing – before moving to Azure, I saw intensive queries cause blocking to its internal threads in ActMon without causing problems for others. Under what circumstances could blocking internal to a SPID slow down IX creation or was it a false positive?
Ricardo: To refresh our SSRS server I want to move from backup /restore to Replication but am in 2 minds whether to replicate into the current database or a new one? A new one would be cleaner but we seem to have many reports with environmental requirements like hard-coded database names. (Brent says: this is the kind of thing I’d do a consulting engagement to really understand the requirements, available resources, budget, etc, but maybe some of y’all have free answers for them.)
DGW in OKC: What is your opinion of Azure AD? Does it work well with on-prem SQL Server security?
hamburger sandwich asks: How do you automate windows server OS updates for SQL Server high availability group clusters? (Brent says: this question comes up a lot, and I haven’t ever seen a good answer that didn’t involve possibly rolling back open transactions or backups, but maybe y’all have seen something I haven’t.)
Junior DBA wannabe asks: How would you script the creation of a database and add it to AG replication from a single node? Same for app login creation
HotPotatoJuggler asks: DBA says SQL Server DB wait states are negligable. Webserver has resource, and is barely moving the dial. Connection from webserver to DB <8ms. Running query <2ms. 300ms for the .net adapter to fill for each call (x20). AWS just old instance – do they slow them down? 🙂 Any clues? (Brent says: I’m not even sure what the hell this is asking.)
Frozt asks: Hi Brent, Do you have any gotcha’s we got to be aware regarding Always Encrypted with Enclaves? Also is this better than application level encryption?

Curious to see what y’all say!

Announcing SQLBits 2024: 19-23 March, Near London

[Video] Office Hours at Svínafellsjökull Glacier in Iceland

29 Comments. Leave new

Vlad Drumea
September 19, 2023 1:52 pm

1. The only method I can think of (but I’d like to dig more into this) is something like concatenating that record into a string variable and then applying HASHBYTES on that variable
Example using the Users table in the StackOverflow database (yes, I didn’t concatenate all the columns, but it should be enough for a PoC:
DECLARE @Record NVARCHAR(MAX);

SELECT @Record = CAST(ID AS NVARCHAR(10))
+ ISNULL(AboutMe, ”)
+ CAST(ISNULL(Age, 0) AS NVARCHAR(10))
+ CONVERT(NVARCHAR(23), CreationDate, 21)
+ DisplayName
+ CAST(ISNULL(DownVotes, 0) AS NVARCHAR(10))
+ ISNULL(EmailHash, ”)
+ CONVERT(NVARCHAR(23), LastAccessDate, 21)
FROM Users
WHERE id = 26837

SELECT HASHBYTES(‘SHA2_256’, @Record);

2. I’ve had to do this across production instances that were initially configured with very small TLog growth increments and o issues. The steps I used:
1. CHECKPOINT
2. TLog backup
3. Shrink
I strongly recommend doing this during a maintenance window if you’re doing it in Prod.

Reply
- Vlad Drumea
  September 19, 2023 2:31 pm
  
  For question I left out steps 4 and 5, by mistake:
  4. update the log file growth increments to more reasonable values in line with your workloads
  5. regrow the transaction logs.
  Sorry about that.
  
  Reply
- keith
  September 19, 2023 6:09 pm
  
  you need to add a separator in between each property, not an isnull to replace with string on the property. Otherwise you can end up with equivalent hash results for different combinations of nullable columns, columns with blanks.
  
  Reply
  - Vlad Drumea
    September 19, 2023 6:37 pm
    
    You are correct about needing a separator, my suggestion was just off the top of my head and I didn’t take that into account.
    ISNULL is still needed tho due to CONCAT_NULL_YIELDS_NULL.
    
    Reply
    - SQLNightOwl
      October 2, 2023 4:05 pm
      
      concat_ws should work for null issues without needing to address nullability.
      
      Reply
Vlad Drumea
September 19, 2023 2:07 pm

7. If you’re the DBA and SSRS report development doesn’t fall in your job description, I’d recommend the following:
1. getting in touch with the team(s) owning/designing the reports
2. present the situation to them (multiple reports, some with hardcoded connection parameters, etc.) – it comes in handy to have an Excel spreadsheet with report usage history for at least the past 30 days (you can get that from the ExecutionLog views)
3. Ask them to go over their reports and decide what they no longer need/use and what might need rewriting to avoid hardcoded connection data and just the Data Source feature that SSRS provides for easier management of connection info.
If they’re onboard with that then they can start creating the new reports in the new/clean database.

Reply
- Vlad Drumea
  September 19, 2023 2:16 pm
  
  I accidentally missed the word “use” from here
  “…hardcoded connection data and just the Data Source feature that SSRS….”
  it should be
  “…hardcoded connection data and just use the Data Source feature that SSRS…”
  
  Reply
Vlad Drumea
September 19, 2023 2:29 pm

10. I think you should be able to do this using the dbatools PowerShell module, specifically the New-DbaDatabase and Add-DbaAgDatabase commands.
https://docs.dbatools.io/

Reply
marcscirri
September 19, 2023 3:01 pm

5.) Assuming your SQL Server is behind your firewall and not exposed to the public, your best bet is to enforce MFA for network access. Once someone is authenticated on the network, you can feel comfortable it is really them who are then accessing network resources. If you have public facing resources that you want to protect, that is a different story.

Reply
- Brent Ozar
  September 19, 2023 3:04 pm
  
  Marc – that doesn’t fix the problem of individuals accessing someone else’s account, hacked workstations inside the network, etc.
  
  Reply
  - marcscirri
    September 19, 2023 4:12 pm
    
    But if they hacked network MFA for a network account, they would be able to hack MFA when logging into SQL Server (compromised email account or mobile device for 2nd factor tokens). Unless the folks logging into SQL Server aren’t using network accounts and are using SQL Server logins.
    
    Reply
Joe Bonomo
September 19, 2023 3:11 pm

8. Havent had any issues using Azure AD as the authority for on-premises SQL servers. Takes a bit to setup I believe as I was not the one to do it but I know we have this working.

Reply
Ben Thul
September 19, 2023 3:46 pm

For Q1, I’ve used the approach outlined in https://blog.greglow.com/2018/07/02/sql-finding-rows-that-have-changed-in-t-sql-checksum-binary_checksum-hashbytes/ to create a calculated column on the table. And specifically why I did it was to facilitate a data migration from one table to another (in some cases, separate database; in others, changing structure). In those cases, I’d create an index on ID include chk to easily determine whether a given row was different between source and target.

Reply
- Jessica
  September 19, 2023 4:17 pm
  
  This with a computed column is how I would attack it for a table (persisted or not depending on specific use case).
  
  The other way is make the data hashing an application layer problem and just have a column for the hash.
  
  Depending on use case, MD5 is much quicker than sha_256 and still useful for validation when hash collisions are low impact. For most “did the data change or is it the same” applications where security needs are minimal, md5 is still valid with a birthday paradox hash collision at 2^64 hashes, Which is the max on an unsigned big int.
  
  Reply
Ben Miller
September 19, 2023 4:45 pm

4. When using TDE on any database even in an AG takes overhead that you are really not going to be able to affect since it is behind the scenes encrypting and decrypting data. The only tip for performance is to ensure that SQL is only reading what it needs to read, so it is only decrypting as little as possible. Remember that this is a general tip for performance for any database, it is just very effective in TDE environments.

Reply
- Gavin Harris
  September 26, 2023 11:26 am
  
  Microsoft did a technical review of the performance overhead for TDE and measured it as an extra 3-5% of CPU effort. If you find adding TDE to your system has introduced measurable performance issues, then you were probably running your server pretty ragged to begin with. At this point you have two options. Add more CPU, with the associated licence costs but it may be a quick solution while you look at option 2, a detailed performance tuning project.
  
  Reply
Kim Crosser
September 19, 2023 4:47 pm

Regarding computing a checksum on an entire row.
SQL has a built-in “CHECKSUM” function that will return an INT checksum against an entire row. However, it has several flaws that can cause duplicate checksums for different data. (See the Microsoft documentation on CHECKSUM – https://learn.microsoft.com/en-us/sql/t-sql/functions/checksum-transact-sql?view=sql-server-ver16).
The “BINARY_CHECKSUM” function doesn’t have the same problems, but it has different problems – it can’t process some data types (https://learn.microsoft.com/en-us/sql/t-sql/functions/binary-checksum-transact-sql?view=sql-server-ver16).
There is also a HASHBYTES function (https://learn.microsoft.com/en-us/sql/t-sql/functions/hashbytes-transact-sql?view=sql-server-ver16). This takes an input (varchar, nvarchar, or varbinary) and returns a “hashed” varbinary(8000) value, but you have to get the source row data into a single string object first.
The “simplest” way that I get a “good” checksum is to use the BINARY_CHECKSUM function, but I use “select * from Table as XML PATH(”)” to return the row data as a string, which the BINARY_CHECKSUM function can then operate on to produce an INT checksum.
Example:
select BINARY_CHECKSUM((select top 1 * from MyTable for XML path(”)))
This may not be the most efficient at run time, but it is very easy to code.

Reply
Thiago Timm
September 19, 2023 5:04 pm

1. Kansas4444 asks: Hi Brent, do you have any advice on calculating checksum for an entire row on SQL Server ? (Brent says: I’ve never had to do this, so no idea.)

SELECT CHECKSUM(*)
FROM table_name

SELECT CHECKSUM(column_name_1, column_name_2, …)
FROM table_name

Reply
Brad
September 19, 2023 5:11 pm

5. You can use third party software like Symantec’s Privileged Access Manager (PAM for short.) You can use MFA to access PAM and check out user admin accounts for your SQL boxes. The timeout for the password can be set to a time of your choosing and puts a layer between the admin users and SQL Server. Better have a really good DR plan for PAM though.

Reply
- Vlad Drumea
  September 19, 2023 6:01 pm
  
  Isn’t PAM just for OS level users that are local admin on VMs (pretty similar to Thycotic)?
  I think the question was for MFA supprt when authenticating to the SQL Server instance (either via SQL authentication or Windows authentication), not on the host on which the instance resides.
  
  Reply
  - Brad
    September 19, 2023 6:19 pm
    
    The idea is the OS level users don’t have direct access to SQL or the VMs without going through PAM. I’ve never heard of Thycotic so I can’t compare. Each DBA will have their domain account and a domain PAM account. Create an AD group of only DBA’s grant that group access to SQL and the VMs, have two factor auth to PAM and there is your layer of separation of user accounts. Although it will be required to copy the password from PAM and perform a ‘run as’ to use SSMS directly to the instance, it provides a higher level of security than out of the box SQL because PAM controls the password, periodic password changes and complexity. Granted this is not perfect, but if the intent is to separate DBA user accounts from SQL, it fits the bill.
    
    Reply
Koritt
September 19, 2023 5:20 pm

9. Our server admin team uses System Center Configuration Manager (now called Microsoft Endpoint Configuration Manager, I believe) to automatically push OS updates to the SQL servers. They defined an AD Group for active SQL servers that SCCM will need to explicitly exclude and the DBA team uses PowerShell cmdlets (Get-ADObject, Get-ADComputer, Get-ADGroup, Remove-ADGroupMember, and Add-ADGroupMember) to populate that group with the AD Computer objects for all active cluster nodes.

All passive nodes are patched in the week preceding the monthly maintenance downtime, the downtime window is used to failover cluster instances and availability groups since any failover risks transaction rollbacks, and then the formerly-active-but-now-passive nodes are patched in a subsequent week.

Although you could script the failovers via PowerShell if you wanted to automate the entire process end-to-end, we prefer to have a DBA involved in planned service disruptions for additional safety and validation.

Reply
Peter Sawatzki
September 19, 2023 6:53 pm

1. this is how I calculate hash values for whole rows very successful ever since:

select
ht.id,
ht.HashValue
FROM
(select c.Id Id,
HASHBYTES(‘SHA1’,(select customers.* FROM (values(null))foo(bar) for xml auto)) HashValue FROM customers c
) ht

Reply
- Peter Sawatzki
  September 19, 2023 6:56 pm
  
  small correction:
  select
  ht.id,
  ht.HashValue
  FROM
  (select c.Id Id,
  HASHBYTES(‘SHA1’,(select c.* FROM (values(null))foo(bar) for xml auto)) HashValue FROM customers c
  ) ht
  
  Reply
Bearded Bard
September 19, 2023 9:02 pm

@HotPotatoJuggler – Your DBA is correct. The database isn’t your bottleneck. You seem to be benchmarking the response time of a webpage from inside the browser’s built-in dev tools. You answered your own question by finding that “.NET adapter” (sic) is making *TWENTY* db queries, each of which spends about one third of a second. This is your target. Drill deeper here.

Bonus Tip: Bone up on the following SQL syntax to get the 20 rows you’re after in a single query (if you’re causing your own problem by making 20 db queries in a loop)

SELECT … ORDER BY … OFFSET x ROWS FETCH NEXT y ROWS ONLY

Hope this helps! Cheers!

Reply
- Bearded Bard
  September 19, 2023 9:04 pm
  
  (#11) ^^^^
  
  Reply
JCas
September 19, 2023 9:18 pm

TDE prevents instant file initialization, this affects DB data file growths, so size the DB correctly and ensure growths of an appropriate size are done at appropriate times to minimize impact.

https://learn.microsoft.com/en-us/sql/relational-databases/databases/database-instant-file-initialization?view=sql-server-ver16

Reply
Frank the Datenpunk
September 19, 2023 10:05 pm

Question 9
Makes sure that primary/secondary node never run on same host. CAU takes over the rest of the Work!

Reply
Ildjarn
September 20, 2023 6:07 am

9. There is a cluster aware updating powershell module, perhaps that can help
https://learn.microsoft.com/en-us/powershell/module/clusterawareupdating/?view=windowsserver2022-ps

Reply