12 Office Hours Questions I Don’t Have Answers For

SQL Server
29 Comments

Hey, you! You look smart. Here are highly-upvoted questions that fellow readers submitted at https://pollgab.com/room/brento that I don’t have answers for. If you’d like to jump into the comments, you can reference questions by number with your answers. Time to show off your brains!

  1. Kansas4444 asks: Hi Brent, do you have any advice on calculating checksum for an entire row on SQL Server ? (Brent says: I’ve never had to do this, so no idea.)
  2. Margaret asks: Hi Brent — we are in the process of doing a lot of VLF reductions across the farm and I wanted to ask if issuing a CHECKPOINT command before trying to shrink the log was anything that might cause problems going forward?
  3. Stanley Burrell: What is your opinion of Microsoft PerfInsights for load testing prospective new SQL Server VMs? (Brent says: never even heard of it.)
  4. BamDBA asks: When using Transparent Data Encryption for databases in Availability Groups in a busy environment, are there any tips/best practices to increase performance?
  5. I’m_Trying asks: Are you aware of any way to implement multifactor authentication for versions previous to SQL 2022 for on prem databases?
  6. LCK_Tess_Monster: RE: MAXDOP=0 & Mastering Indexing – before moving to Azure, I saw intensive queries cause blocking to its internal threads in ActMon without causing problems for others. Under what circumstances could blocking internal to a SPID slow down IX creation or was it a false positive?
  7. Ricardo: To refresh our SSRS server I want to move from backup /restore to Replication but am in 2 minds whether to replicate into the current database or a new one? A new one would be cleaner but we seem to have many reports with environmental requirements like hard-coded database names. (Brent says: this is the kind of thing I’d do a consulting engagement to really understand the requirements, available resources, budget, etc, but maybe some of y’all have free answers for them.)
  8. DGW in OKC: What is your opinion of Azure AD? Does it work well with on-prem SQL Server security?
  9. hamburger sandwich asks: How do you automate windows server OS updates for SQL Server high availability group clusters? (Brent says: this question comes up a lot, and I haven’t ever seen a good answer that didn’t involve possibly rolling back open transactions or backups, but maybe y’all have seen something I haven’t.)
  10. Junior DBA wannabe asks: How would you script the creation of a database and add it to AG replication from a single node? Same for app login creation
  11. HotPotatoJuggler asks: DBA says SQL Server DB wait states are negligable. Webserver has resource, and is barely moving the dial. Connection from webserver to DB <8ms. Running query <2ms. 300ms for the .net adapter to fill for each call (x20). AWS just old instance – do they slow them down? 🙂 Any clues? (Brent says: I’m not even sure what the hell this is asking.)
  12. Frozt asks: Hi Brent, Do you have any gotcha’s we got to be aware regarding Always Encrypted with Enclaves? Also is this better than application level encryption?

Curious to see what y’all say!

Previous Post
Announcing SQLBits 2024: 19-23 March, Near London
Next Post
[Video] Office Hours at Svínafellsjökull Glacier in Iceland

29 Comments. Leave new

  • 1. The only method I can think of (but I’d like to dig more into this) is something like concatenating that record into a string variable and then applying HASHBYTES on that variable
    Example using the Users table in the StackOverflow database (yes, I didn’t concatenate all the columns, but it should be enough for a PoC:
    DECLARE @Record NVARCHAR(MAX);

    SELECT @Record = CAST(ID AS NVARCHAR(10))
    + ISNULL(AboutMe, ”)
    + CAST(ISNULL(Age, 0) AS NVARCHAR(10))
    + CONVERT(NVARCHAR(23), CreationDate, 21)
    + DisplayName
    + CAST(ISNULL(DownVotes, 0) AS NVARCHAR(10))
    + ISNULL(EmailHash, ”)
    + CONVERT(NVARCHAR(23), LastAccessDate, 21)
    FROM Users
    WHERE id = 26837

    SELECT HASHBYTES(‘SHA2_256’, @Record);

    2. I’ve had to do this across production instances that were initially configured with very small TLog growth increments and o issues. The steps I used:
    1. CHECKPOINT
    2. TLog backup
    3. Shrink
    I strongly recommend doing this during a maintenance window if you’re doing it in Prod.

    Reply
    • For question I left out steps 4 and 5, by mistake:
      4. update the log file growth increments to more reasonable values in line with your workloads
      5. regrow the transaction logs.
      Sorry about that.

      Reply
    • you need to add a separator in between each property, not an isnull to replace with string on the property. Otherwise you can end up with equivalent hash results for different combinations of nullable columns, columns with blanks.

      Reply
      • You are correct about needing a separator, my suggestion was just off the top of my head and I didn’t take that into account.
        ISNULL is still needed tho due to CONCAT_NULL_YIELDS_NULL.

        Reply
  • 7. If you’re the DBA and SSRS report development doesn’t fall in your job description, I’d recommend the following:
    1. getting in touch with the team(s) owning/designing the reports
    2. present the situation to them (multiple reports, some with hardcoded connection parameters, etc.) – it comes in handy to have an Excel spreadsheet with report usage history for at least the past 30 days (you can get that from the ExecutionLog views)
    3. Ask them to go over their reports and decide what they no longer need/use and what might need rewriting to avoid hardcoded connection data and just the Data Source feature that SSRS provides for easier management of connection info.
    If they’re onboard with that then they can start creating the new reports in the new/clean database.

    Reply
    • I accidentally missed the word “use” from here
      “…hardcoded connection data and just the Data Source feature that SSRS….”
      it should be
      “…hardcoded connection data and just use the Data Source feature that SSRS…”

      Reply
  • 10. I think you should be able to do this using the dbatools PowerShell module, specifically the New-DbaDatabase and Add-DbaAgDatabase commands.
    https://docs.dbatools.io/

    Reply
  • 5.) Assuming your SQL Server is behind your firewall and not exposed to the public, your best bet is to enforce MFA for network access. Once someone is authenticated on the network, you can feel comfortable it is really them who are then accessing network resources. If you have public facing resources that you want to protect, that is a different story.

    Reply
    • Marc – that doesn’t fix the problem of individuals accessing someone else’s account, hacked workstations inside the network, etc.

      Reply
      • But if they hacked network MFA for a network account, they would be able to hack MFA when logging into SQL Server (compromised email account or mobile device for 2nd factor tokens). Unless the folks logging into SQL Server aren’t using network accounts and are using SQL Server logins.

        Reply
  • 8. Havent had any issues using Azure AD as the authority for on-premises SQL servers. Takes a bit to setup I believe as I was not the one to do it but I know we have this working.

    Reply
  • For Q1, I’ve used the approach outlined in https://blog.greglow.com/2018/07/02/sql-finding-rows-that-have-changed-in-t-sql-checksum-binary_checksum-hashbytes/ to create a calculated column on the table. And specifically why I did it was to facilitate a data migration from one table to another (in some cases, separate database; in others, changing structure). In those cases, I’d create an index on ID include chk to easily determine whether a given row was different between source and target.

    Reply
    • This with a computed column is how I would attack it for a table (persisted or not depending on specific use case).

      The other way is make the data hashing an application layer problem and just have a column for the hash.

      Depending on use case, MD5 is much quicker than sha_256 and still useful for validation when hash collisions are low impact. For most “did the data change or is it the same” applications where security needs are minimal, md5 is still valid with a birthday paradox hash collision at 2^64 hashes, Which is the max on an unsigned big int.

      Reply
  • 4. When using TDE on any database even in an AG takes overhead that you are really not going to be able to affect since it is behind the scenes encrypting and decrypting data. The only tip for performance is to ensure that SQL is only reading what it needs to read, so it is only decrypting as little as possible. Remember that this is a general tip for performance for any database, it is just very effective in TDE environments.

    Reply
    • Microsoft did a technical review of the performance overhead for TDE and measured it as an extra 3-5% of CPU effort. If you find adding TDE to your system has introduced measurable performance issues, then you were probably running your server pretty ragged to begin with. At this point you have two options. Add more CPU, with the associated licence costs but it may be a quick solution while you look at option 2, a detailed performance tuning project.

      Reply
  • Regarding computing a checksum on an entire row.
    SQL has a built-in “CHECKSUM” function that will return an INT checksum against an entire row. However, it has several flaws that can cause duplicate checksums for different data. (See the Microsoft documentation on CHECKSUM – https://learn.microsoft.com/en-us/sql/t-sql/functions/checksum-transact-sql?view=sql-server-ver16).
    The “BINARY_CHECKSUM” function doesn’t have the same problems, but it has different problems – it can’t process some data types (https://learn.microsoft.com/en-us/sql/t-sql/functions/binary-checksum-transact-sql?view=sql-server-ver16).
    There is also a HASHBYTES function (https://learn.microsoft.com/en-us/sql/t-sql/functions/hashbytes-transact-sql?view=sql-server-ver16). This takes an input (varchar, nvarchar, or varbinary) and returns a “hashed” varbinary(8000) value, but you have to get the source row data into a single string object first.
    The “simplest” way that I get a “good” checksum is to use the BINARY_CHECKSUM function, but I use “select * from Table as XML PATH(”)” to return the row data as a string, which the BINARY_CHECKSUM function can then operate on to produce an INT checksum.
    Example:
    select BINARY_CHECKSUM((select top 1 * from MyTable for XML path(”)))
    This may not be the most efficient at run time, but it is very easy to code.

    Reply
  • 1. Kansas4444 asks: Hi Brent, do you have any advice on calculating checksum for an entire row on SQL Server ? (Brent says: I’ve never had to do this, so no idea.)

    SELECT CHECKSUM(*)
    FROM table_name

    SELECT CHECKSUM(column_name_1, column_name_2, …)
    FROM table_name

    Reply
  • 5. You can use third party software like Symantec’s Privileged Access Manager (PAM for short.) You can use MFA to access PAM and check out user admin accounts for your SQL boxes. The timeout for the password can be set to a time of your choosing and puts a layer between the admin users and SQL Server. Better have a really good DR plan for PAM though.

    Reply
    • Isn’t PAM just for OS level users that are local admin on VMs (pretty similar to Thycotic)?
      I think the question was for MFA supprt when authenticating to the SQL Server instance (either via SQL authentication or Windows authentication), not on the host on which the instance resides.

      Reply
      • The idea is the OS level users don’t have direct access to SQL or the VMs without going through PAM. I’ve never heard of Thycotic so I can’t compare. Each DBA will have their domain account and a domain PAM account. Create an AD group of only DBA’s grant that group access to SQL and the VMs, have two factor auth to PAM and there is your layer of separation of user accounts. Although it will be required to copy the password from PAM and perform a ‘run as’ to use SSMS directly to the instance, it provides a higher level of security than out of the box SQL because PAM controls the password, periodic password changes and complexity. Granted this is not perfect, but if the intent is to separate DBA user accounts from SQL, it fits the bill.

        Reply
  • 9. Our server admin team uses System Center Configuration Manager (now called Microsoft Endpoint Configuration Manager, I believe) to automatically push OS updates to the SQL servers. They defined an AD Group for active SQL servers that SCCM will need to explicitly exclude and the DBA team uses PowerShell cmdlets (Get-ADObject, Get-ADComputer, Get-ADGroup, Remove-ADGroupMember, and Add-ADGroupMember) to populate that group with the AD Computer objects for all active cluster nodes.

    All passive nodes are patched in the week preceding the monthly maintenance downtime, the downtime window is used to failover cluster instances and availability groups since any failover risks transaction rollbacks, and then the formerly-active-but-now-passive nodes are patched in a subsequent week.

    Although you could script the failovers via PowerShell if you wanted to automate the entire process end-to-end, we prefer to have a DBA involved in planned service disruptions for additional safety and validation.

    Reply
  • 1. this is how I calculate hash values for whole rows very successful ever since:

    select
    ht.id,
    ht.HashValue
    FROM
    (select c.Id Id,
    HASHBYTES(‘SHA1’,(select customers.* FROM (values(null))foo(bar) for xml auto)) HashValue FROM customers c
    ) ht

    Reply
    • small correction:
      select
      ht.id,
      ht.HashValue
      FROM
      (select c.Id Id,
      HASHBYTES(‘SHA1’,(select c.* FROM (values(null))foo(bar) for xml auto)) HashValue FROM customers c
      ) ht

      Reply
  • @HotPotatoJuggler – Your DBA is correct. The database isn’t your bottleneck. You seem to be benchmarking the response time of a webpage from inside the browser’s built-in dev tools. You answered your own question by finding that “.NET adapter” (sic) is making *TWENTY* db queries, each of which spends about one third of a second. This is your target. Drill deeper here.

    Bonus Tip: Bone up on the following SQL syntax to get the 20 rows you’re after in a single query (if you’re causing your own problem by making 20 db queries in a loop)

    SELECT … ORDER BY … OFFSET x ROWS FETCH NEXT y ROWS ONLY

    Hope this helps! Cheers!

    Reply
  • TDE prevents instant file initialization, this affects DB data file growths, so size the DB correctly and ensure growths of an appropriate size are done at appropriate times to minimize impact.

    https://learn.microsoft.com/en-us/sql/relational-databases/databases/database-instant-file-initialization?view=sql-server-ver16

    Reply
  • Frank the Datenpunk
    September 19, 2023 10:05 pm

    Question 9
    Makes sure that primary/secondary node never run on same host. CAU takes over the rest of the Work!

    Reply
  • 9. There is a cluster aware updating powershell module, perhaps that can help
    https://learn.microsoft.com/en-us/powershell/module/clusterawareupdating/?view=windowsserver2022-ps

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.